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ABSTRACT 


Svedlow, Martin, Ph.D., Purdue University, August 1976, Analytical 
and Experimental Design and Analysis of an Optimal Processor for Image 
Registration. Major Professor: Clare D. McGillem. 

The registration of temporally differing images is defined in a way 
that allows its analysis via parameter estimation theory. Assuming 
spatial congruence between the images, one image is defined as the signal 
and the second image Is assumed to be the signal plus additive noise, 
where the noise is comprised of the temporal changes and any additional 
noise introduced by the sensor system. The parameters to be estimated 
are the relative translations between the images. 

With this formulation, a quantitative measure of the performance of 
the registration processor is defined which leads to the derivation of an 
optimum processor' that yields the best possible performance in terms of 
the criteria chosen. The performance measure used is the variance of the 
registration error, where the error is the spatial difference between the . 
registration position indicated by the processor in the presence of 
noise and thetrue overlay location. With this performance criterion 
the optimum processor is that which minimizes the variance of the regis- 
tration error. Derivation of the processor which satisfies this criterion 
shows it to be the matched filter, which also maximizes the output signal- 
to-noise ratio. Substitution of this processor into the general expres- 
sion for the variance of the registration error yields a compact 
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expression in terms, of the effective bandwidth and signal-to'-noise ratio. 

Given the. matched filter processor, two methods of implementation 
are shown. In the- first approach the second image (signal plus noise) is 
passed through a single filter where the position at which- the output is 
a maximum is taken as the indicated registration position. For the 
second' technique of implementing the matched filter both of the images 
are passed through a prewhitening filter and the resulting outputs are 
crosscorrel ated , where the prewhitening filter is designed so as to pre- 
whiten the input noise (temporal changes). Again the indicated regis- 
tration position is the location of the maximum- of the output. This 
second method is the one that finds itself applicable to image registra- 
tion algorithms that have been implemented by other investigators in the 
form of utilizing a preprocessing operation on the images prior to over- 
laying them. 

Actual- determination of the matched filter processor to be- used in 
a particular situation requires a model of the- autocorrelation function 
of the noise (temporal changes). For application of' this type- of regis- 
tration processor to LANDSAT I satellite imagery an estimation procedure 
for determining a model- of the autocorrelation function is carried out. 
ft is found that for the imagery examined, the autocorrelation function 
o,f the temporal changes is of an exponential form. Utilizing this para- 
metric form for the autocorrelation function an example is presented in 
which the matched filter is evaluated. It is found that with the pre- 
whitening' fol lowed by crosscorrelatlon approach to implementation of the 
matched fil'ter, that the preprocessing or prewhitening f i 1 ter appl i ed to 
both images is a derivative type operator. This result indicates that a 
derivative type preprocessor should be applied to both images- prior to 
overlay? ng- them. 
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One of the basic assumptions made in the derivation of the optimum 
processor is that_the images be spatially congruent. However, this is 
not necessarily true when given two sets of actual imagery due to unavoid- 
able perturbations in the scanner viewing platform orientation. An 
analysis is presented in which the loss in the output signal-to-noise 
ratio due to the violation of the spatial congruency assumption is shown. 
The results 'show that in the presence of relative spatial distortions 
which are increasing in image. size, such as a linear scale change, that 
there is an optimum integration area size for the cross correlation stage 
of the processor which yields a maximum output signal.-to-noise ratio. 
Determination of this optimum integration area size is a straightforward 
procedure which' is shown in a series of examples illustrating several 
types of relative spatial distortions. In two of the examples, models 
of the distortions observed between temporally differing- LANDSAT ! images 
were used. In this way it is shown how the optimum integration area 
size for the registration of images in practice may be found in a straight- 
forward manner. i 

Finally,' an experimental comparison of the techniques used in ■ • 
several registration algorithms proposed or implemented by other investi- 
gators is presented. This study provides both an objecti ve .compari son 
of the different algorithms plus a corroboration of the analytical re- 
sults derived in the earlier sections. It is found that preprocessing 
the images via a gradient type operator, which Is a derivative type 
operation, improved the overall performance of the registration processor. 
This agrees with the preprocessing stage of the optimum filter derived In 
the application of the matched filter to the registration of images where 
the preprocessing filter is found to be a derivative type operator. 
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CHAPTER 1 
INTRODUCTION 

1.1. Genera] Discussion 

Image registration is a topic that has become important with the 
advent of satellite borne sensors capable of producing large quantities 
of mul ti temporal imagery. Analysis of the differences between images 
taken at different times requires that the images be matched spatially 
so that it can be determined how the corresponding data points change 
with time. This spatial alignment or overlaying of images is what is 
meant by image registration. 

One of the primary objectives for overlaying temporally differing 
images is to provide the capability of utilizing the time dependent 
characteristics of a scene for its analysis. For example, imagery of 
agricultural areas are subject to change from one season^ to the next due 

v 

to the growing cycle of the crops. If the classification of different 
crop species is the purpose of a project and the crop types of concern 
are indistinguishable spectrally at a particular time, then their growing 
cycles might provide the necessary information for separating one from 
the other. 

Various i nvestigations* have been carried out attempting to determine 
what type of additional information is obtainable from the use of the 
time as well as spectral dimensions of the imagery. Several studies 
involve use of imagery taken by the mul tispectral scanner aboard the 
LANDSAT 1 satellite which orbits at an altitude of approximately 600 
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miles. The mul ti spectral scanner operates in four spectral bands (0.5 - 
0.6 pm, 0.6 - 0.7 Jim, 0.7 “ 0..8 ynr, and 0.8 - 1.1 pm) and has- a resolu- 
tion of approximately 50 meters along the scan sweep by 80 meters along 
the satellite's path.. 

Utilizing LANDSAT I imagery, Anuta and Bauer [ 5 ] have performed an 
investigation of the use of multi temporal data for the recognition of 
different crop cover types and agricultural and urban feature- change 
detection. With access to mul ti temporal imagery over the same area it 
was found that classification performance for different crop species 
could, be improved during certain growing seasons. This study also showed 
that the automatic identification of urban change was promising but re- 
quired further investigation. Another study also concerned with the 
problem of automatic crop identification is discussed in references [10] 
and [19].. Part of' this investigation involved the effect of the use of 
mul ti temporal data as opposed to a single time data set on the performance 
of correctly- recognizing particular crop types. 

Design of ~a processor to carry out the overlay of images- requi res a 
certain amount of information about the spatial relationships of the 
imagery to be registered. If the images are spatially congruent-, then 
the processor need only find- the relative translation between the images. 
For Images that differ not only by translation, but which are also dis- 
torted spatially relative to each other, the processor must be capable 
of determining the spatial distortions as well as the translation. 

These are the classes of imagery addressed in this study, with the 
primary emphasis- on spatially congruent images. The processor- under 
investigation is that designed to find the-, relative translation- between 
the images, assuming that no relative spatial distortions existv. Such 
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an analysis is directly applicable to a particular class of imagery which 
is currently available in large volume. The mul ti spectra 1 scanner data 
acquired by the LANDSAT I satellite has the property that the relative 
distortions between multipass imagery over the same area are minimal, 
which may be attributed to the stability of the viewing platform. Thus, 
as a first approximation, for small enough subimages the assumption of 
spatial congruence is reasonable. 

Figures 1-1 through 1-4 contain examples of several sets of multi- 
temporal imagery taken by the LANDSAT I mul tispectral scanner. Each of 
the figures displayed is from the 0.6 - 0.7 ym spectral band. Figure 1-1 
shows two images over Tippecanoe County, Indiana which were taken in 
September and November of 1972. A scene from Hill County, Montana for 
two times during the spring and summer seasons is pictured in Figure 1-2. 
Figure 1-3 illustrates an example of a year's span over western Kansas 
where the data was taken in July of 1973 and 1974. And two temporally 
differing data sets over Missouri are shown in Figure 1-4. Note that 
although each of the scenes are recognizable from one time to the next, 
temporal changes are evident. Also observe that the spatial scale of 
both images in each time pair appears to be the same with little relative 
distortion. This supports the spatial congruence assumption for small 
subimages. 

The assumption of spatial congruence for small subimages underlies 
several registration algorithms that have been proposed and implemented 
[ I » 3 , 8 , 9 » 1 1» 30] • This approach al lows the overlay of a sample of 
corresponding subimages from each image assuming no relative spatial dis- 
tortions exist for the small subimages. In the absence of spatial distor- 
tions each of the subimage registrations can be accomplished by a 
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Figure 1-1. LANDSAT I imagery over Tippecanoe County, Indiana. 
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relative translation. However, since the original images may be spatially 
distorted, all of these relative translations of the subimages may not 
be the same, so that the registration of the total image (full frame) 
cannot be accomplished by a simple translation. Therefore, given the 
translations for each of the corresponding subimages, a spatial warping 
or coord inage transformation is applied to one of the images so as to 
align all of the subimages simultaneously. 

Once the relative spatial characteristics of the temporally differ- 
ing images have been established, (which in this case is the spatial 
congruency of small subimages), a determination must be made of the re- 
maining parameters required to achieve registration. For example, since 
registration is, by definition, a spatial matching, it requires a quanti- 
tative measure of the similarity between two images so that a determina- 
tion can be made as to whether the match has been achieved or not. Thus, 
one requirement that must be met is that an appropriate similarity measure 
be chosen. A second parameter that must be considered is the temporal 
change that has occurred, since it is this change that contributes 
largely to the uncertainty in the registration of the images. Although 
the change at a particular data sample is unpredictable, a model of the 
overall characteristics and the spatial correlation of the temporal 
changes will increase the information available for the processor to use. 

Another consideration that should be taken into account is the 
performance of the registration processor. Ultimately, the optimum 
processor is that which yields the best performance. This necessitates 
the development of a quantitative measure of the performance so that the 
optimum can be defined in terms of this measure. One example of such a 
criterion is the variance of the registration error, where the error is 
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the spatial difference between the true registration location and the 
position indicated by the processor. This error arises due to random 
differences in the images resulting from temporal changes that have 
occurred. In this case, the optimum processor is that which minimizes 
the variance. 

The presence of the need to develop a processor capable of register- 
ing imagery has provided the impetus for research in the attempt to find 
a solution to this problem. The following section briefly outlines some 
of the previous developments and analyses that have been carried out. 

1.2, Previous Investigations 

With the availability of large volumes of mul ti temporal images ac- 
quired by the LANDSAT I mul t ispectral scanner several image registration 
algorithms have been proposed and implemented. As mentioned in the 
preceedlng section, the minimal relative spatial distortion between 
mul ti temporal imagery gathered by LANDSAT i has made feasible the design 
of processors based on the assumption that small subimages are spatially 
congruent. With this assumption a sampling of subimages from each of 
the images to be overlayed may be registered and the corresponding' 
relative translation recorded for each sub image. Since the entire images 
may not be spatially congruent, all of the translations need not be the 
same, so that a spatial warping of one of the images is required to 
simultaneously align all of the subimages. Therefore, this minimal 
relative spatial distortion allows the registration process to be broken 
down into two stages. The first is that in which the spatially congruent 
subimages are registered and the second is that in which the spatial 
warping Is carried out. 
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First consider those algorithms concerned only with images that are 
not relatively distorted spatial ly„ Although all of these methods are 
similar in the fact that each performs a. search for a given subimage in 
a larger background image containing the temporal changes, the actual 
procedure for carrying out this overlay and criteria for determining 
when the registration is achieved are different. The LARS registration 
system [ 1 , 5 ] uses the correlation coefficient as the similarity measure 
(Table 6-1), where a maximum of its magnitude indicates the overlay 
location. A complete search is made over all possible .registration lo- 
cations, computing a value for the correlation coefficient at each 
translation. . 

A second algorithm which comes under the heading of sequential 
similarity detection algori thms (SSDA's) [ 8 , 9 ], uses a different simi- 
larity measure at only a sample of points for each translation. The 
similarity measure used is the sum of the absolute values of the differ- 
ences between the corresponding subimage data samples at each translation 
(Table 6-1). With this measure a minimum value indicates registration. 

Another algorithm employing a similarity measure like the correla- 
tion coefficient plus saving computational time in a manner similar to 
the SSDA's, performs a preprocessing step prior to overlaying the imagery 
[30]. Instead of using the original imagery for registration, a gradient 
type operation is applied to each of the images first: then they are 

thresholded to produce binary images (images having values of only zero 
or one). Finally, these binary images are used to estimate the correct 
registration position. 

Once a set of subimages has been registered a spatial warping of one 
of the images may be performed to match all of these subimages at the 
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same time. The algorithms developed to handle this part of the processor 
differ primarily in the amount of spatial warping necessary to adequately 
model the relative spatial distortions. The system at LARS [l ,5 ] is 
set up to handle, a second order two-dimensional polynomial transformation 
over an entire image. This may be refined by using a biquadratic warp- 
ing over smal.ler subimages and then fitting the subimages back together 
[46], which in effect accommodates a higher order spatial warping. A 
further extension is made by an algorithm designed to accommodate spatial 
distortions on a line by line basis [ 1 2 , 20 , 22 , 25 , 26, 33> 42] „ 

A second area of study has concerned analyses of the different 
aspects of the image registration problem as opposed to the development 
of specific algorithms for registering images. One series of -investi- 
gations involves the d Istingul shab \ 1 i ty of the output of the processor at 
the correct registration location compared with the output at all sur- 
rounding locations [6 ,16,32],. With the basic design criterion that 
the processor yield a maximum output at the correct registration posi- 
tion, a processor, has been designed to maximize the ratio of the output 
at the correct registration position to the variance of the output at 
all surrounding positions. in this manner, the output at the correct 
registration position is made more easily distinguishable from that at 
the surrounding locations. 

Another study examines the puli- in range of a processor, where the 
processor 'is simply a product correlator El3,l4l» Thi.s analysis concerns 
the ability to determine the correct direction of movement in the search 
for the registration location, so that it is not necessary to search all 
prospective registration positions. 



10 

With regard to the situation in which the images to be registered 
are spatially distorted relative to one another, several studies have 
been made to determine ways of estimating this distortion. One approach 
uses the properties of the Fourier transforms of the images [16, 17 3 » and 
a second utilizes a least squares estimation procedure [k$]. 

1.3. Outline of Investigation 

The development of a registration processor is begun by first de- 
fining the image and temporal change properties in such a way as to form 
a foundation from which a solution may be approached in an organized 
fashion. For purposes of the present study one of the images to be 
registered is considered as the signal, while the second Image is assumed 
to contain all of the temporal changes and is defined as the signal plus 
additive noise,, where the temporal changes are modeled as additive noise, 
in this fashion the registration of the images may be approached as a 
parameter estimation problem in the presence of noise. The parameters 
to be estimated are the relative translations between the images. The 
noise is the temporal change and any measurement noise that may be 
present in the system. 

Given this definition of how the images and temporal changes are to 
be treated, it is possible to determine a quantitative measure of the 
processor performance. This is done in Chapter 2 where an expression 
for the variance of the registration error is derived. The error is the 
difference between the correct registration position, and that position 
indicated by the processor which is operating in the presence of temporal 
changes. Alternatively, a second cri terion which may be used to evaluate 
the processor is the output signal-to-noise ratio of the processor at 
the correct registration location. 
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Either of these quant? tative -cr? teria now may be utilized to define 
an optimum processor. For example, when considering- the output signal- 
to-noise ratio, the optimum processor is that which maximizes it at the 
registration position. Alternatively, an optimum processor can be de- 
fined as the one that minimizes the variance of the registration error. 
Both of these considerations are explored and related in Chapters 2, 3 
and Appendix A.- In Chapter 2, where. an expression for the registration 
error variance is derived, it is found that use of a filter which maxi- 
mizes the output si gnal-to-noi se ratio, the matched filter (2-36), leads 
to a compact expression for the variance. While in Appendix A it is 
shown that given the general expression for the registration error 
variance, the .processor which minimizes this variance is the same as the 
one which maximizes the output si gnal-to-noi se ratio. Therefore, the 
definition of optimum in terms of registration error variance minimiza- 
tion is equivalent to defining optimum in terms of output signal-to- 
noise maximization since both yield the same processor, which is the 
matched filter. 

Using these results, Chapter 3 presents the method by which the 
matched filter. may be found and implemented. An example is given 
illustrating this. The particular example chosen is designed to conform 
with experimental observations of autocorrelation function estimates 
for the temporal changes found between LANDSAT I images described in 
Chapter 5, which indicate that the autocorrelation function of the 
temporal changes is of an exponentially decaying form. This functional 
form for the autocorrelation function is the model chosen for the 
example. Thus, the example in Chapter 3 derives an optimum processor 
applicable to the registration of images in practice. 
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In Chapter 4 the effect of relative spatial distortions on the 
performance of a processor designed to operate on spatially congruent 
imagery is discussed. This degradation of performance ■ i s found in terms 
of the loss in signal-to-noise ratio as a result of the spatial distor- 
tions, where the relationship between performance and signal-to-noise 
ratio is given in Chapter 2 in terms of the variance of the registration 
error. One of the purposes of this section is to determine the optimum 
size of the images to be registered. For spatially congruent imagery it 
is readily shown that the largest possible image should be used for 
registration. However, this is not necessarily true when relative 
spatial distortions exist, since the spatial distortions cannot be re- 
moved by translation only. For the situation in which the distortions 
are increasing with image size (as for a constant scale factor change) 
it is shown that there is a particular image sizewhich yields a maximum 
output signal-to-noise ratio. Determination of the optimum area size is 
a straightforward procedure which may be accomplished directly by evalua- 
tion of an integral expression. This is illustrated in section 4.5 
where the expression for the output signal-to-noise ratio is evaluated 
by a numerical integration method for several different types of spatial 
distortions. Two general linear spatial distortions are presented as 
well as two examples using models of the spatial distortions observed 
for temporally differing LANDSAT 1 images. In these last two examples a 
straightforward method of applying the analytical results to practical 
image registration is illustrated. 

in Chapter 5 the experimental analyses are begun. This chapter 
concerns the experimental estimation of the temporal change properties 
which are pertinent to the development of an optimum processor for 
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practical image registration. The derivation of an optimum processor 
presented in Chapter 2 is formulated via two approaches -both of which 
require knowledge about certain properties of the temporal changes. The 
first method of solution necessitates the assumption that the probability 
density function of the temporal changes is Gaussian. Further examina- 
tion of this formulation also shows that a model. of the autocorrelation 
function of the temporal changes is required. This latter requirement 
is inherent in' the Gaussian density function assumption. The second 
method of solution requires knowledge of only the temporal change auto- 
correlation function. Therefore, the two properties of concern are the 
probability density function and autocorrelation function of the temporal 
changes. The purpose of this chapter is to provide a model of these 
properties for-LANDSAT 1 imagery so that an optimum registration processor 
may be designed for the overlaying of LANDSAT 1 images in practice. An 
experimental procedure is carried out for estimation of a general model 
for both of these- properties. The first part of the study concerns the 
probability density function and the second part is concentrated on the 
autocorrelation function estimate. The general model observed for the 
autocorrelation function indicates that it is of an exponential form. 

This observation conforms with both the analyses in Chapters 2, 3, and 
Appendix A, and their application to the results of the experimental 
investigation carried out in Chapter 6. 

Chapter 6 presents an experimental comparison of several different 
types of registration algorithms that have been proposed and implemented 
by other investigators [ 1 , 3 , 8 , 9 ,11,30]. The impetus for such an 
investigation lies in the fact that each of these algorithms had been 
developed and tested independently of one another, thus leaving the 



potential user at a loss to objectively compare their performance. This 
study is directed towards the relative evaluation of the techniques 
utilized in these registration algorithms for accomplishing the overlay 
of images. The techniques of concern are the similarity measure, the 
criterion used to measure the spatial matching between the images and 
and therefore indicate the registration location, and the effect of 
different preprocessing techniques on the registration performance. In 
this analysis it is found that the results support the combined analytical 
findings of Chapters 2, 3, and Appendix A in which it is found that the 
optimum processor is a matched filter, and the experimental findings of 
Chapter 5 where the autocorrelation function of the temporal changes is 
observed to be exponential. The processor that experimentally yielded 
the best performance utilized a gradient preprocessing operation on the 
images .prior to overlaying them, which approximates the derivative type 
preprocessing operation derived in the example of Chapter 3 where a 
matched filter is used in the presence of exponentially autocorrelated 
noise or temporal changes. 

Overall, the thesis provides an analytical and experimental investi- 
gation into the design of an optimum registration processor. Using the 
fundamental assumption of spatial congruity between the images, the 
design problem is approached via parameter estimation theory, where the 
parameters to be estimated are the relative translations between the 
Images. In this way i,t is possible to define optimum in terms of a 
quantitative measure, as in the minimization of the registration error 
variance of the maximization of the output s ignal-to-noi se ratio. Once 
the optimum processor has been derived, an analysis of the loss incurred 
by a deviation from the spatial congruity assumption is performed, since 
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this is the situation encountered in practice where relative spatial 
distortions arise due to perturbations in the viewing position of the 
scanner from one time to the next. Application of the analytically 
derived optimum processor to the overlay of images in practice is pro- 
vided by the first experimental analysis in which a model is developed 
for the temporal change properties required for evaluation of the 
optimum processor. Finally, this is followed by the experimental com- 

v 

parison of several algorithms for registering images which acts both to 
provide a relative performance rating of algorithms implemented by other 
investigators, plus to give an experimental evaluation of the analytical 
results found in the derivation of an optimum processor. 
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CHAPTER 2 

VARIANCE. OF THE REGISTRATION ERROR 

An important part of the development of a registration processor is 
the ability tO;.quanti tati vely characterize the registration problem in 
some manner. In this way, one may utilize this measure of performance to 
design an optimum processor which maximizes the performance according to 
the criterion chosen. One such measure of performance is the tolerance 
to within which one Is able to register two images. 

This sectjon concerns the derivation of an expression for the 
variance. of the registration error, where the error is the discrepancy 
between the observed registration position and the true registration 
location. Two models for the variance of the error in the registration 
of two different images of the same scene are developed. The method of 
solution employed is analogous to that used for the determination of the 
error in the measured delay time in a radar system. For purposes here 
the radar system model assumes that the returned signal is a delayed 
version of the original signal corrupted by additive noise. (As adapted 
to the registration of two images, the noise is defined as the difference 
between the two- images at the correct registration posi tion, . and is 
-therefore additive. The time delay corresponds to a spatial translation 
.or displacement. 

Several analyses of the radar problem have been carried out based 
upon different premises [l 5] , [29] , [44] „ These approaches may be 
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categorized as. those- which use the probability density function of the 
noise directly and those which? do not. The first case utilizes maximum 
a. posteriori' probability, maximum likelihood, or minimum mean square error 
estimates. All three- estimators are based upon knowledge of the noise 
probability density function.- The second case is based only upon the 
output of a filter which gives a maximum output at the correct time delay 
when the- input is noise- free. 

An analysis of this sort should prove useful in several respects. 

The results should- give an indication of the best possible registration 
of two images given the models, of the data and noise. Once the models 
of the parameters involved have been found or assumed, an optimum processor 
to implement the overlaying procedure may be developed. Comparison of 
existing registration systems with- the results obtained herein may also 
be performed. However, one must keep in mind the assumptions the entire 
analysis, wi 1 1 be based on-, for different assumptions may yield' different 
results. 

it is assumed in the following investigation that the useful signal 
is present, reducing the problem to one of estimation only rather than 
detection as well as estimation, it is further assumed that the signal 
shape is known and nonrandom, although the parameter that is to be measured 
is a random variable. Since the original signal is known, it does not 
have a probability density function.. However, the second signal does 
conta-ln noise and possibly other perturbations and is therefore a sample 
function of a random process. The problem will be approached with this 
in mind. 
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2.1. Method 1 

The solution to the problem of the first case, in which the prob“ 

ability density function of the noise is directly involved, depends upon 

the cost function which is assigned to the error. and the a posteriori 

distribution, .p^[m(x) ] , of the signal as a function of a parameter, m(x), 

given the received signal, f. A minimum mean square error estimate is 

the mean of p^'[m(x)]; an absolute value cost function gives the median 

of the probability function; the maximum a posteriori estimate yields 

the maximum of p^.[m(t)]. The maximum likelihood estimate may be viewed 

as the same as the maximum a posteriori estimate when there is no prior 

knowledge of the density function of the parameter, p[m(x)], or p[m(x)3 

is assumed uniform over the entire range of interest. All four of the 

above cost functions will yield the same solution when p[m(x)] is uniform 

and the conditional density function p £x (f ) J Is symmetric and unimodal 

m 

[^3]. A Gaussian distribution which has been assumed for p^[m(x)] in 
several analyses,' is a member of this latter class. 

There are'two basic reasons for the choice of this particular type 
of probab i 1 i ty dens i ty function. The first is the availability of a 
closed form analytical solution. The second is that a. Gaussian density 
function is the model that has been used to represent each of the 
classes which comprise the total image [ 36 , 37 ]. 

This derivation of the variance of the registration error is an 
adaptation of the solution obtained by Zubakov and Wainstein [AO], In 
this problem one assumes that the additive noise is jointly Gaussian with 
zero mean. It is also assumed that the density function of the parameter 
{i.e., the misregistration or displacement of the images) is uniform in 
the range of interest. 
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With these assumptions one may construct the likelihood function 
and then find its peak- to determine ' the optimum estimator* ■ 


P x t ^ 

- p + (vV - p-'-v-y- Tx ’(| y . 


x' y' 


• ( 2 - 1 ) 


where. 


A( V r y> 


P f (T x’V 


P(t )' 
r x’ y- 


T ,.T ,(.f) 

x y 


p(f) 

m(x+x ,y+r ) 
x y 


x ,x:- 

x* y 
f(x,y) 

n(x,y) 


likelihood function of the -d isplacemen-t 

parameters, t and t ; 

x y • 

conditional density function of t and x 
gi ven f (x,y) ; x y 

density function of the parameters x^ and x^; 

conditional density function of f(x,y) given 

the translation’ parameters x and, x : 

- x y’ 

density function of f (x,y) ; 

known signal as. a function- of the spatial 
coordinates and the displacement parameters; 

translation parameters 

m(x,y)' +- n(x,y) = received signal; 

addftive noise;, assumed independent of the 
signal . 


Since- the data; that are being analyzed: are discrete, it is con- 
venient to use integer subscripts rather than continuous spatial coordi- 
nates. A. further notational savings is- realized by combining the double 

subscripts into a single subscript. A two dimensional array m. , i = 1, 

ij: 

• • = PJ’J = 1 »■ • **-» q, is converted to a one dimensional data set m^, h = 

1 » .«o-, pq-. This conversion loses noth i ng- from the standpoint of the 

\ 

results to be derived. 

In this discrete case a continuous function has been sampled and 


may be; denoted. 
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v ( vV * m( vvvV 

n h‘ r!x i- y j : 

f. = f (x. ,y .) = m + n 
h .1 j h h 

h =■ \ , . . . , H 

H = pq = total number of samples. 

To arrive. at an analytical result, the probability density function 
of the noise must be known. Because of the many independent contributions 
to the differences between images being registered, it is reasonable to 
approximate the density function as being Gaussian. The probab i 1 ? ty 
density function of the noise is therefore given by , 


P n ( - } . , n x H/2, n] ]/2 


(2ir) | R 


, 1 T d -1 \ 

exp (■ y n R nj 


( 2 - 2 ) 


where R is the covariance matrix of the noise, R , - E[n n, ]. The 
— gh g h 

density functions in the likelihood equation then become. 
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£ “ (f J 9 •••> 

5} ~ ® * 


The likelihood function is then 


a ( t x»V = p(t “ ,T " ) ^ 


(2w) H/2 |r| ,/2 

H H 

exp £ £ Q , f m (t ,t ) 

g h gh g h x’ y 


1 H H 

- — E E Q ,m (t ,t )m, (t ,t ) 

2 h gh g x’ y h v x’ y ; 

Q, ^ = gh element of _R . 


] 


( 2 - 3 ) 
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Since it is only the maximum of A(t ,t ) which is desired, the 

x y 

problem can be reduced even further. Let p(t ,t ) be a uniform distri- 

x y 

bution over the area of interest. This is a reasonable assumption since 

there is no a priori knowledge about the actual distribution. With this 

assumption, examination of (2-3) shows that the only factor which is not 

a constant with respect to r and t is., 

x y 


<j> = 2 2 Q . f m (t ,t ) 

T h gh g h x’ y 


(2-A) 


since ,. 


H H 

2 2 Q ,m (t ,t )m (t ,t ) 

„ h ghgx’ yhxy 


(2-5) 


and p(r ,t ) are constants for all values of t and t , and p(f) does not 
x y x y 

depend upon t and t . Therefore, the maximum of A(x ,t ) is determined 
K K x y x y 

sole:ly by the maximum of 4> 0 The. optimum processor is then the one which 
finds the maximum o.f This type of. processor may be viewed as a cor- 
relator which is weighted according to the inverse noise covariance 
function,. Q , . For the case in which the noise is white with spectrum 

gh 

N /2, the covariance, matrix becomes (N / 2) I (l- identity matrix), and the 

o o - - 

optimum processor is^ simply a correlator. 


* -iT l Vh'vV 

o h 


(2-6) 


Given that the maximum point (this translation position is denoted 

by (t -t )) of the Likelihood function has been found, a measure of the 
1 x y 

accuracy of the estimate is necessary so that the performance of the 

estimator may be evaluated. One such measure is the variance of the 

estimate, about the maximum point of A(x ,,x ),'. For this analysis it is 

x y 

convenient to use ln[A(x ,t )3 which is a monotonic function of A(t x> t ). 

x y ' 



The logarithm of the likelihood function is expanded in a second 
order Taylor series as a function of the delay parameters about its peak 
in the x-axis and y-axis directions separately. It is assumed that 
1 n can be approximated by a second order polynomial around its 

peak. 

Only the results in the x-axis direction are given since the y-axis 
direction results are completely analogous. 
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where 
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A necessary condition for the maximum point of In A(t ,t ) is that, 

x y 


3 in A(t ,t ) 3 In A(t ,t ) 

■ x ? y „ 0 _ x* y 
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. x y 

The Taylor series expansion may then be reduced to 


In A(t ) - In A(t ,t ) 
x y x’ y 
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Rearrangi ng. thi s equation one obtains 3 


(2-9) 
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A( V y = a ^ x *v ex p 
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a 2 = - 
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If one further' assumes a large s ignal-to-noi se ratio, then 
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= variance in the x-di recti on ■ (2-11) 
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since [m (t ,t ) -f ] is dependent only upon the noise and is small com- 
9 x y 9 

pared to m (t ,t ) . 

9 x- y 

Greater insight into the solution may be obtained by looking at the 
result in the frequency domain as opposed to the spatial domain. The 
transformation yields an interesting answer,. The variance becomes, 

(2-14) 
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effective bandwidth in the x-axis direction; 
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M(u,v) . Fourier transform of the known signal; 
S R (u,v) noise spectrum. 

In the spatial domain. 
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(2-16) 


(2-17) 


(2-18) 


With the above assumptions the variance has been reduced to a 
•function of the effective bandwidth and signal-to-noise ratio which are 
expressed .in the frequency domain by equations (2-15) and (2-16), and in 
the spatial domainby equations (2-17) and (2-18). This implies that if 
one can estimate the effective bandwidth and the signal-to-noise ratio 
in the x-axis and y-axi s directions, then the variance of the registration 
error can be estimated. 

Now consider the second derivation for the variance which is based 
upon different assumptions. 


2.2. Method 2 

A second derivation of the variance of the registration error is 
developed in tbi.s section, in this case, the only assumption about the 
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signal and processor is that in the absence of noise, the output of the 
processor will be a maximum at the correct transl ation [29] . No assump- 
tions about the probability distribution of the noise are needed. As will 
be seen, the results of this derivation are similar to those obtained in 
the previous derivation, even though the two approaches are quite unalike. 

The signal corresponding to the image to be overlayed is modeled as 
having two components, the desired signal and additive noise. This signal 
Is passed through a filter and the position where the maximum of the out- 
put signal occurs is taken to be the correct registration position. 
However, since the filter is designed to yield a maximum at the correct 
delay only in the noise free case, this observed registration position 
may differ from the true registration location. The discrepancy between 
these two positions is the registration error. 

First consider the parameters involved. 


f(x,y) 

signal ; 

m(x,y) 

additive noise; 

f (x,y) + m(x,y) 

data~set to be registered; 

* 

X 

-C 

filter impulse response; 

g(x,y) 1 

f(x,y) * h(x,y) = output signal in the absence of 
noise; 

n(x,y) 

m(x,y) * h(x,y) = output due to the noise input; 

z(x,y) 

g(x,y) + n(x,y) = composite output signal used to 
estimate the correct registration position; 

(x.y) 

true registration position; 

(x,y) 

estimated registration position. 


The derivation proceeds as follows. First expand g(x,y) in a second 
order Taylor series about (x,y) . 
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g(x,y) = g(x,y) + g (x,y) [x-x] + g (x,y) [y-y] 

a y 

+ 9 w (x*y) [x-x] [y-y] + i g (x,y) [x-x] 

Ay AA 

• + £ g yy (x,y) [y-y] 2 (2-19) 


where the subscripts denote the partial derivatives with respect to the 
corresponding variables, 


g (x,y> 


3g(x,y) 

3x 


|x = x, y = y 

This subscript notation is used for the remainder of this section. 
Assume that (xrx) and (y-y) are small enough so that all higher order 
terms may be neglected. 

Note that a, necessary condition for a maximum is 


3g(x,y) _ o _ 3g(x,y) 

3x 3y 

Substitute this result into the equation for z(x,y). 
z(x,y) T g(x,y) + 9 xy (x,y) [x-x] [y-y] 

+ i g xx ( x »yH x "^ 2 

- : + i 9 (x.y.) [y-y] 2 + n(x,y).' . (2-20) 

Again use the necessary condition for an observed maximum, 

3z(x,y)/3x = 0 = 3z(x,y)/3y, 

z x (x,y) =0= g xy (x,y) [y-y] 

■"+ 9 vv (x,y) [x-x] + n (x,y) (2-21) 

z (x,y) = 0 = g (x,y) [x-x] 

** / * xy 

+ g yy ( x »y)[y-y3 + n y (x,y). 


( 2 - 22 ) 
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Arrange these equations in terms of (x-x) and (y-y) , the error in 
the registration,. 


g n -g n 

(x-x) = x y y- n -x 

q g -g 
y xx yy xy 


g n -g n 

(y-y) = *?* - x y 


(2-23) 


(2-24) 


q g ~g 
y xx yy xy 


where the ' arguments, (x,y) and (x,y) have been left out for no;tational 
convenience. 

One can now find the variance of the error by taking the expectation 
of (x-x) 2 and (y-y) 2 , where it is assumed that E[x-x] = 0 = E [y-y] . 


Var [x-x] = E[(x-x) 2 ] = (x-x) 2 


Var'[y-y] = E[(y-y) 2 ] » (y-y) 2 
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( 2 - 26 ) 

(2-27) 

( 2 - 28 ) 


One may use these equations to calculate the variance of the error, 
but in doing so, ?t is found that a filter function must be specified 
first. This is intrinsic in the parameters in these equations,, which is 
seen more clearly if one writes these terms as a function of the filter 
(wide sense stationarity is assumed). 


n yCx»y) = //// h y (x-a,y-0)h y (x-y,y~A) 
•• R m (a-y,0-A)da d0 dy dA 


(2-29) 
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n y (x,y)n x (x,9) = //// h y (x-«,y-e) 

• h (x-Y,y-X)R (a-Y,e-A) 

x m 

* da dg dY d'X . (2-30) 

n^(x,y) = //// h (x-a,y-3)h (x-Y,y-A) 

' X A A 

* R^Ca-Y, 3“^)da dg dY dX (2-31) 

9 yv (x,y) “ // h (x-a,y-g) f(a,g) da dg ( 2 - 32 ) 

AA XA 

g y y(x,y) = // h yy (x-a,y-e) f(a,3) da dg (2-33) 

9 vv ( x ,y) = // h (x-a,y-g) f(a,g) da dg (2-3 2 *) 

Ay « xy 

where 

R m (a-Y, g-X) = _ mCa7gymlY,Xj (2-35) 


One now has an expression for determining the registration error 
variance. Equations (2-27) and (2-28) will allow one to find the 
variance of the error for any filter function; however, they seem to 
bear little resemblance to the results in the first section. To obtain 
a particular solution, a specific filter function must be chosen. The 
one that has been picked is intuitively pleasing in two ways; it is an 

* j 

optimum type filter in that it maximizes the signal-to-noise ratio; and 
it yields an answer in terms of the signal bandwidth and signal-to-noise 
ratio. In Appendix A it is also shown that this filter minimizes the 
error variance. This filter is the so called "matched filter." 

Let 

H (u, v) „ «P (-jM*u tiv) , ). (2-36) 

S (u,v) 
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S (u,v) Fourier transform of R (x,y); 

F(u,v) Fourier transform of f (x,y) ; 

H(u,v) Fourier transform of h(x,y). 

Substituting this filter function into equations (2-27) and (2-28), the 
results simplify to. 
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This simplification is seen more easily if one first converts equations 
(2-29) through (2~34) to the frequency dona in and then inserts the 
matched filter. 

One obtains the final result by converting these last- two equations 
to the frequency domain. They then become. 
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(2-39) 
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B 


x 


effective bandwidth of input signal in the x-axis 
di rection; 
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SNR = output signal-to-noise ratio. 

It is seen that the variance of the error is again expressible in 
terms of the effective signal bandwidth and signal-to-noise ratio. These 
results are similar to those obtained in the first section, but the re- 
lationships are not quite as simple. 

• A further simplication can be obtained by making some additional 
assumptions. .The error variance expressions then will be the same as in 

the first method. These assumptions concern the term g" (x,y) in equations 

xy 

(2-39) and (2-40) . If this term equals zero, then the desired result is' 

O 

obtained. Such a condition involves the quantity [ j F (u, v) j ]/[$ m (u,v)] 

since g (x,y) is a function of this quantity. Let K(u,v) = 
xy 

E|f(u,v) I 3/ES (u,v)] for notational convenience. Since K(u,v) is an 

even function of ii and v, in order for g (x,y) to equal zero it is 

xy 

sufficient that, 

K(u,v) = K(-u,v) (2-44) 

or necessary and. sufficient that, 

00 CO CO CO 

/ / uv K(u,v) du dv = / / uv K(-u,v) du dv. 

'0 0 •'0 0 ’ 

The expressions then become 


(2-45) 
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(x-x) 2 = — 

B SNR 
x 


(2-46) 


(y-y) 2 = ~ y ~ • (2-47) 

B SNR 

y 

which are completely analogous to the results obtained by the first 
method. 

An example of when these last assumptions might apply is the follo- 

ing situation. Let F(u,v) and S (u,,v) be bandlimited to W and W in the 

’ m x y 

£ 

respective axis directions. And let [|F(u,v)| ]/[S m (u,v)] equal a con- 
stant. This would occur when the noise spectrum has a shape similar to 
the signal spectrum. In this case, it might be advantageous to model the 
two spectra as differing only by a constant factor for simplicity in 
estimating the variance to be expected. This may be written, 


F (u,v) 

— y - ■■■-- < ! — = c, a constantc 
S (u,v) 


m 


From Equation (2-43) 


W W 

SNR = c / X / Y du dv. 
-W -W 

y y 


So, 


_ SNR 

' c “ 4W W * 
x y 

Then from equations (2-41), (2-42) and (2-43), 

B 2 SNR = 4tt 2 c 
x 

B 2 SNR = 4ir 2 c 

y 


2W- 


L 


LW ) 
x 


(2W ) 

y 


2W 

3 


(2-48) 


(2-49) 


(2-50) 


(2-51) 

( 2 - 52 ) 
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Substituting in the expressions for c, the variances are, 

/ 

3 


-.2 • 
(x-x) = 


4it 2 W 2 SNR 

x 


(y-y ) 2 = 1 


i{7r 2 W 2 SNR 

y 


(2-53) 


(2-5*0 


The respective standard deviations then are. 


Standard deviation of (x-x) - 2 t t W~~ /sNR 

x 


A 

Standard deviation of (y-y) = 


1 


2ttW /SNR 

y 


(2-55) 


(2-56) 


One may obtain a quantitative feel for the values of these expres- 
sions by using the sampling intervals for the LANDSAT-1 data in this 
example. The sampling interval is about 60 meters along the columns and 
about 80 meters along the lines. Substituting these values in equations 
(2-55) and ( 2 - 56 ), one finds that. 

Standard deviation of-error long the 

(2-57) 


, . . 44.1 

1 1 nes = * meters 


y'sirn 

Standard deviation of error along the 


33.1 

columns meters. 


>^SNR 


(2-58) 


These results indicate that with the chosen filter, the standard 
deviation of the registration error is quite, small. 


' 2.3. Conclusion 

' A quantitative measure of the registration processor accuracy in 
terms of the variance of the error of the registration has been derived. 
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With the appropriate assumptions, the variance is shown to be inversely 

/ ' 

proportional to the square of the effective bandwidth .times the signal- 
to-noise ratio. The final expressions are presented' aga.in to‘ emphasize 
both the form and simplicity of their representation. 

Var [ (x“x) ] = 

B SNR . 
x 

Var [ (y-y) 1 = — 

B SNR 

y 

This derivation should prove useful in several respects. First of 

all it may be a basis for the analysis of different registration systems 

by providing a way to estimate the expected accuracy of the system. 

Secondly, it provides a straightforward way of estimating this error. 

As a final consideration the basic assumptions needed for the two 

methods are listed. These assumptions are important and must be 

realized fully to be sure that they apply -to the situation in which they 

will be utilized. For the first method these assumptions are: the noise 

is additive and independent of the signal.; the joint probability density 

function of .the noise' is Gaussian; the a priori distribution of the delay 

parameters is uniform over the range of interest; the variance may be 

modeled in the x-axis and y-axis directions separately; the final result 

is dependent upon a large s ignal-to-noi se ratio [cf. step from equation 

(2-12) to (2- 1 3) ] o The basic assumptions for the- second method are: the 

noise is additive and independent of the. signal; .the noise spectrum must- 

be known; the chosen filter is the "matched filter;" to obtain results 

completely analogous to the first method there is one further assumption 

that must be made about the ratio [|F(u,v)| ]/[S (u.,v)3 [cf. equations 

1 1 m 

{2-kk) and (2-^)3. 
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In particular, note the assumptions related to the probability 
density function and the spectrum of the noise„ The validity of assuming 
a Gaussian density function and of assuming a particular type of auto- 
correlation (or spectral density) function is pursued further in succeed- 
ing sections,, The discussions approach these issues along both analyti cal 
and experimental lines, the first to provide a theoretical basis for what 
should occur,, and the second to provide confirmation of these assumptions,, 
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CHAPTER 3 

IMPLEMENTATION OF A MATCHED FILTER FOR IMAGE REGISTRATION 


In the previous sections it was indicated that the registration 
processor performance improved as the output signal-to-noise ratio (SNR q ) 
Increased. For example, the variance of the registration error Is shown 
to decrease with an increase in the SNR q , thus providing a greater regis- 
tration accuracy (Chapter 2). This result prompts an investigation to 
find the processor which maximizes the SNR q thereby improving the system 
performance. Such an analysis is carried out in this chapter. 

The general form for the processor chosen is that of a linear filter 
whose input is the known signal plus zero mean, additive noise,' and whose 
output is designed to yield a maximum at the correct registration lo- 
cation in the absence of noise. For completeness, a detailed derivation 

of the filter which yields the maximum SNR is provided first. These 

o 

findings are then utilized in an example illustrating the optimum filter 
one obtains in the presence of a particular type of noise, where the 
type of noise reasonably models that observed experimentally (Chapter‘5)* 
The analytical approach to this problem is stressed in this section, 
while its agreement with experimental observation is discussed in Chapters 
5 and 6. 




36 


t 

[// h(x -x,t -y)s(x,y) dxdy]' 

Ay. 


SNR = — 

o 00 


(3-D 


E{ [// h(x^-x,x -y)n(x,y) dxdy] } 

— CO • 


h(x,y) = processing filter 
s(x,y) -known signal 
,n(x,y) = additive noise 
Or equivalently. 


SNR = 


Iff h(t -x,T -y)s(x,y) dxdy]' 
—oo y 


Q 00 00 

//-// h(x "x,x -y) h (x -a,x -g) R_(x-a,y-g) dxdy dadg 

s\ y /v y n 


(3-2) 


Where E{*} denotes expectation and R (x ,x ) is the autocorrelation function 

n x y 

of the noise. 

The filter which maximizes this expression is derived in two basic 
steps. Let, 


h(x,y) = /-/ h w (x-ct,y-B)h m (a,g) dadg (3“3 ) 

where h^ (*,*) is a prewhitening filter; i.e., for, 

n '(x,y) = // h (x-a,y-g)n(a,g) da dg (3-^ ) 

w w 

choose h w (x,y) such that, 

E[n (x,y) n (a,g) ] = < 5 (x-a,y-g) (3-5) 

w _ w 

and h (•,*) is the filter which maximizes the SNR Q .with the prewhitened 
noise and signal. The underlying reason for this approach is that in the 
presence of white noise, the Schwartz inequality may be applied directly 
to arrive at the desired filter function in a simple manner. Schematically, 
the composite filter is as shown in Figure 3 - 1 . 
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s(x,y) — H+] 


n(x,y) 


H 

h (t) 

— ► 

1 h (t) 

1 

w 


i m 


output 


h (t) 


Figure 3~1° Component representation of matched filter. 


Since, 


E[n w (x,y)n w (a,b)] = // // h w (x-a,y-3)h w (a-y,b-£) R n (a-Y,3-?)do!d3dYd5' 


— 00 —00 


(3-6) 


and 


E[n (x,y)n (a,b)] = <5(x-a,y-b) 
w w 


h (x,y) is found by solving the integral equation, 
w 


<5(x~a,y~b) = // // h w (x-a,y-3) h w (a-y,b-£) R n (a-y,3-£) dadgdyd? (3“7) 

* —oo —oo ; 

which may be equivalently expressed as, 

$(x-a,y-b) = // H w (u,v)H w (-u,-v)S n (u,v)e j2lltu(x " a)+v(y " b)] dudv (3-8) 


w 


where, 

H (u,v) -^Fourier transform of h (x,y) 


w 


w 


S (u,v) = Fourier transform of R (t ,t )', or the spectral density 
n of the noise n x y 

The solution to this integral equation is, 

H(u,v-)H' (-u,-v)S(u,v) = 1 (3-9) 

w w n . 

since <$(x,y) = // ,1 * g-i^Cux + dudv, by the properties of the 
— 00 

Fourier transform. 
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In the case of a real filter function, which is the situation 
countered with the registration of imagery, then, 

JL 

H (-u,~v) = H (u,v) 
w ’ w ’ 

so that the solution becomes, 


H (u,v.) = 7 — 7 — — p* 

1 w * 1 S (u,v) 

.n 


Once the noise has been prewhitened, the problem reduces to, 


s w < x >y) 




h m (x,y) 


output 


n w (x,y) 


Figure 3~2. Block diagram after prewhitening. 


s (x,y) = jj h (x-ot,y-g) s (a, 3) dadg 
w w 


n w (x,y) = // h w (x-a,y-B)n (a,g) dadg 


where one must- find the filter, h (x,y) , which maximizes the SNR 


,m 


SNR - 
o 


[// h (t -x,t -y)s (x,y) dxdy] 
m x y w 1 ' 

—CO ' 

00 

EUJJ h m (T x ~x,T v -y)n w (x,y)dxdy] 2 } 

— CD ' 


Uti 1 izing' the whiteness property of n (x,y) , the SNR becomes. 


w 


US h m ( V X, V v)s w (x,y) dxdy ^ 

— 00 ' 

SNR = 

o 03 2 

ff h (t -x,t -y) dxdy 
m x y 1 


en- 

(3-10) 


(3-11) 


( 3 - 12 ) 


(3-13) 


This expression then has an upper bound which is found by applying the 
Schwartz inequality. 
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SNR <-.// s 2 (x,y) dxdy (3-1 h) 

O “* w 

• —CO I 

with equal i ty for, 

h m ( V X »V Y) = k S w (x,y) (3 " 15) 

where k is an arbitrary constant. Letting k = 1 , the maximum SNR q is 
achieved for, 

h (t — x , t -y) - s (x,y) (3-16) 

m -x y w 

or equivalently in the frequency domain, 

-j2ir[uT + vx ] 

H (u,v) = S (-u ,-v) e x y (3-17) 

111 w 

and 

l H w < u > v )l 2 ■ riuTvT (3 -' 8) 

■ n 

The block diagram in Figure 3“1 niay now be replaced by its equivalent 

form, combining .the cascaded filters h (x,y) and h (x,y) , as shown in 
* m w 

Figure 3“3. 


s(x,y) 



H (u,v) 


output 


n(x,y) 

Figure 3“3« Block diagram of matched filter. 


where H(u,v) is the Fourier transform of h(x,y) and, 

H(u,v) = H (u,v)H (u,v) (3“19) 

w m 

Substitution of the expressions for H m (u,v) and H w (u,v) in conjunction 
with the identity, 

S w (u,v) = H w (u,v)S(u,v) 


( 3 - 20 ) 




4.0 


S(u,v) = Fourier transform of s(x,y) 
and the assumption of a real valued signal, -allows one to obtain the 


final expression for the filter which maximizes the SNR q . 

N -j2ir.[uT + vx ] 
n 


(3-21) 


Note that .the desired filter depends only upon the signal, S(u,v), 
and the spectral density, S n (u,v), or autocorrelation function, R n ( T x » T y K 
of the noise. Si-nce the signal is known, the only additional knowledge 
required is the noise autocorrelation function. 

Before proceeding with the example, fi rst consider an equivalent 
form for the block diagram of Figure 3“3* Observe that In Figure 3“3 
the entire processing filter is lumped under the heading of H(u,v). An 
equivalent operation is to prewhrten the received signal, prewhiten the 
known signal,, and then crosscorrelate the two. This is illustrated in 
Figure 3“ 4. 


;{x,y)+n(x,y.) 


h w (-x.,y) 




/ 


output 


h u ( x „y) 

~T“ 

:s(x,y) 


Figure 3-4.. Prewhitening representation of matched filter. 

The reason for -viewing the operations rn this manner is because it is 
analogous to the preprocessing state of image registration, where the two 
images to be overlayed are first preprocessed and then registered via a 
crosscorrelati.on technique. 
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The following example provides an illustration of the filter re- 
quired to achieve the maximum SNR^ for noise with a specific type of auto- 
correlation function. The particular parametric form for the autocorrela- 
tion function used was chosen because it was found to. provide a reasonable 
model of the autocorrelation function encountered experimentally (Chapter 
5). The interpretation of the operations will follow the block diagram 
of Figure 3-4, where a prewhitening operation is applied to both the 
received and known signals,, Let, 


R (t- ,t ) 
n x y 




(3-22) 


Then, 


2 “ a l T J " 6 I T J "j ZlT I UT ~ + v O 

S (u,v) =// A e x Y e 


Y dr^dTy (3-23) 


Carrying out the integration one obtains the following expression for the 
spectral density,.. 


S (u,v). = A'' 
n 


r 2a 1 | 

r 23 1 

“2 rn 1 

(.a + 4 tt u J i 

2 ,22 
L3 + 4ir v J 


S i nee , 


| H w (u,v) 


S n (u,v) 


it follows that; 


(3-24) 


| H (u, v) | 2 = -4— [a 2 + 4tt 2 u 2 ] [e 2 + 4/v 2 ] 

W • 4A cs|3 


(3-25) 


H w (u,v) is found by factoring the above expression which is of the form 
H (u,v)H (-u,-v)./ Carrying out the factoring operation gives 

W W ► , 


H (u,v) 
w 



[a + j2iru] [R + j2irv] 


( 3 - 26 ) 
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At this point one may evaluate H w (u,v) via two di fferent approaches: 
a single two-dimensional filter, or two cascaded one- d'i mens ional filters. 


Single Two-Dimensional Filter 


H (u,v) = [a$ + gj2iru + aj2itv - 4^ uv] 

W 2h/a$ 


(3-27) 


Cascaded One- Dimensional Filters 


H (u,v) = H (u) H (v) 
w w u w v 


(3-28) 


H (u) ** ■ ■ - [a + j2iru] 

w u /2ccA 


(3-29) 


H (v) = 


>/2gA 


[0 + j 2 ttv] 


(3-30) 


in the spatial domain these filters become. 


Single Two-Dimensional Filter 


h “ <X ’ y) = 2A«” bS 5U ’ y) + 6 ^ + 


(3-31) 


Cascaded One- Di mens ional Filters 


\ (x> = ik w 6(x,y) " £ ] 


(3-32) 


V y) ■ w 16 s(x ’ y> • * 1 


(3-33) 


The important point to be noticed here is that the prewhitening filter is 
a derivative type processor. This indicates that when one is registering 
two images by first preprocessing each image and then crosscorrelating, a 
derivative type operator in the presence of noise with an exponential 



autocorrelation function will maximize the SNR q and thus improve the. 
expected registration performance. This prediction is corroborated by 
experimental observations which are discussed in Chapter' 6. 





CHAPTER 4 

CHANGES IN THE OUTPUT S I GNAL-TO-NOi SE RATIO DUE TO 
RELATIVE SPATIAL .DISTORTIONS BETWEEN IMAGES 

In the f.ol.low.ing analysis the change in the outputsignal-to-noise 
ratio (SNR) is computed, for the situation in which the images to be 
registrated are -distorted spatially with respect to each other. The need 
for such an investigation is prompted by the fact that in the case of 
LANDSAT imagery,' smal 1 relative spatial distortions exist between images 
taken on different orbits over the same area. The probable cause of 
this lies in small unavoidable perturbations in the satellite's orbit 
such as altitude, heading, and pitch. Similar, but more pronounced spatial 
distortions a Iso,- occur in aircraft scanner imagery,. 

When the registration problem is modeled as passing the background 
image containing the temporal changes (received signal plus additive 
noise) through a .fi 1 ter so as to maximize the. SNR at the correct regis- 
tration position, the processor used essentially correlates a filtered 
version of the- background image (signal plus noise) with a filtered ver- 
sion of the reference image (signal), where the two images to be regis- 
tered are denoted as the background image and the reference image. The 
background image ‘is a temporal-ly differing and spatially distorted 
version of the reference image corrupted by additive noise. One of the 
parameters that must be chosen in this correlation is the area over which 
the integration is carried out. For geometrically congruent imagery it 
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intuitively follows that a larger observation area yields a higher SNR. 
However, this is not the case when relative spatial distortions dependent 
upon image size exist between the images. For the case in which the dis- 
tortion increases with size, as in a simple difference in scale, it is to 
be expected that beyond a certain image size the SNR will decrease due to 
the fact that the images cannot be simultaneously matched at widely sepa- 
rated points by translation only. Therefore, when relative distortions 
exist it should be expected that there is an optimum integration area 
size which realizes a maximum SNR. This is the problem that is examined 
in this chapter. 

The derivation is divided into two major categories: white noise only 
is present; or nonwhite noise is present. Within each category an ex- 
pression for the SNR as a function of the integration area size is derived 
for both relatively nondistorted and distorted spatia-1 scales between the 
reference and background images. 

The procedure for comparing these two situations involves using the 
filter which maximizes the SNR when the spatial scales are the same (the 
matched filter), and then using a distorted spatial scale version of this 
same filter to observe the effect on the SNR due to relatively distorted 
spatial scales. Theohoice of which spatial coordinate system is dis- 
torted, reference or background image, is arbitrary since the distortions 
are only relative. The reason for proceeding in this fashion -is that 
this models quite well what actual ly happens in practice. Since the 
relative distortion is unknown beforehand, It cannot be corrected for 
prior to processing the images, so that the relatively distorted images 
must be dealt with as they exist. 


REPRODUCIBILITY OF TR- 
ORIGINAL PAGE IS POOR 



For two-dimensional signals, in this case imagery, the output signal- 
to-noise ratio and related parameters are defined as follows. 


— / / 7 h(x x -x,T -y)s(x,y)dxdy]}" 


SNR = 


x y -T -T 


E{[' 


1 


T T 

4 T T / X / y h(T x -x T -y)n(x,y)dxdy] } 

x y -T -T y 

' x y 


(4-1) 


Where, 

SNR = output signal-to-noise ratio 

s{x,y) = signal; reference image 

n(x,y) = additive zero mean noise 

h(x,y) = processing filter 

4T T = observation area size 
x y 

(t ,t ) « translation 
x y 

in this expression the SNR is defined as the ratio of the square of the 
expected value; of the output due to the signal and the variance of the 
output due to 'the noise. The SNR is a function of the observation area, 
through the integration limits. 

To proceed, Tt is necessary that the signal and noise properties be 
defined more completely to allow for evaluation of the SNR. The necessary 
assumptions for'this analysis are, • 

(i) s(x,y) and n(x,y) are independent 


(ii) R s "(x-a,y-b) = s(x,y)s(a,b) 


r N 


( i I i ) R n (x-a,y-b) = n(x,y)n(a,b) = •{ 


6(x-a,y-.b); white noise 


N 

6(x-a,y-b)+R n (x-a,y-b) ; 
c 

nonwhite noise 

It is important:- to note that assumption (ill) states that in the presence 
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of nonwhite noise, the. noise? is. comprised of a- white noise component with 

N . * 

autocorrelation function -y- dfo-a ,y-b) , and a colored noise component with 

autocorrelation function R^ (x-a,y-b). The Inclusion of the white noise 

c 

component is realistic as regards practical measurements and also avoids 
the possibility of singular detection. Later on in the analysis it is 
seen that this assumption leads to a solution employing a prewhitening 
filter as a component of the matched filter for nonwh'i te noise.. 


4.-F. White Noise With No Relative Spatial Distortion 
In th-i:S section the SNR wi IT’ be evaluated for the case where white 
noise only* is present,. 


N 

R (t >t ) = S'(t ,t ) 
n x y 2 x y 


(4-2) 


and no relative spatial distortions exist between the background and 
reference.-, images. For evaluation, of the SNR (eq. 4-1.) a filter must be 
chosen., in keeping with the analyses of Chapters 2, 3', and Appendix A, 
a matched filter is employed. in the presence of white noise this filter 
has an impulse response of the following form. 


h(.T -x,x -y) = s(x„,y) 

a y 

Substitution of this Into the expression for the SNR yields, 

<E[ TPTT f X / Ys 2 ( x »Y)dxdy ]> 2 

xy-T -T 

-X -JL 


(4-3) 


SNR » 
w 


T T 

[£)—"• / X / Y s(x,y)n(x,y)dxdyl 2 } 


(4-4) 


x y -T -T 
' x y 


where the subscript w denotes white noise.. Evaluation of SNR is carried 

w 

out as fol lows , 
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* E [' 4 ' tV / X / Y s2 ( x »y) dxd Yl> 2 = / X / Y * 2 (x,y) dxdy} 2 

X y -T -T x y -T -T 

x y ' .x y 

, T T _ 

• = x / y R g (0,0) dxdy r 

x y -T -T 
' x y 

= R 2 (0,0) 


(4-5) 


E{ [tt^t — ‘/ X / Y s(x,y)n(x,y)dxdy] 2 } 


4T T _ T 
x y -T -T 
1 x- y 


1 


— 0 // X //y s(x,y)s(a,b) n(x,y)n(a,b) dxdy dadb 

(4T. T) -T -T 
x y x y 


1 


T T N 

/J X //y 8 (x-a,y-b)-y- <$(x-a,y-b)dxdy dadb 


(4T T )" -T -T 5 
- x y' x y 


\ 1 


NTT 

-T- / X / Y R (0,0) dxdy 


(4T T ) 1 -T -T 

x y x y 

= jrrrT R s (Oi0) 

xy 


(4-6) 


The SNR is therefore. 


2R (0,0) 

SNR ' = 4T T — 

w x y N 

' r\ 


(4-7) 


Note that SNR w "is proportional to the integration area; this is exactly 
what is to be expected since a larger integration area allows access to 
more signal information. 


4.2.' -White Noise With Relative Spatial Distortion 
As in the last section the noise is assumed to be white. it is also 
assumed that relative spatial distortions exist between reference and 
background images-'. Thus, since only a spatially distorted version of the 
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signal is present and the distortions are unknown, the filter is matched 
to this time distorted signal,, The filter. is represented as, 

(4-8) 

where p(x,y) and q(x,y) are functions of x and y that mpdel the relative 

spatial distortion. With this filter the SNR becomes, 

T -T- 


h( T x “ x ,Ty- y ) « s[x+p(x,y),y+q(x,y)3 


/ X / y s Ex+p(x,y) ,y+q(x,y)] s(x,y) dxdy] 2 } 

v 1/ -T T 


SNR = 
w 


x y -T -T 
...x 1 


T T 
x r y 


(4-9) 


EUp r t / X / Y s[x+p(x,y),y+q(x,y)] n(x J y)- dxdy] 2 } 

v w -T — 


I t IV <T> 

x y -T -T 

, - 7 x y 

This expression may be evaluated as follows 

1 T T . ' , • - 

j / X / Y s[x+p(x,y),y+q(x,y)]s(x,y) dxdy]} 2 

4 Vy -T -T 
x y 

1 T T ' 

= { w T " / X / Y s U+p(x,y) ,y+q (x,y) Js (x,y) dxdy}'" 

x y -T -T 
x y 

, T T - . 

= (iff / X J V 8 [p(x,y) ,q(x,y) ] dxdy} 2 

x y -T -T s • ' 

x y 

, • T - T 

( E b"f / X / Y s [x+p(x,y) ,y+q (x,y)]n(x,y) dxdy] 2 } 

x y -T -T 
x y 


(4-10) 


1 


T T ' 


i/ X JJ Y sLx+pix,y) ,y+qCx,y)Jsla+pU,b) ,b+q ia.bj J 


. (in T )* -T -T 

x Y x y • . • 

* n(x,y)n(a,b) dadb dxdy 

• , N T T 

= 2 ~§~ // X // Y R s (x+p'(x,y)-a-p(a,b),y+q(x,y)-b-q(a,b)] 

(4T T X -T -T 

x y x y 


<$(x~a,y-b) dadb dxdy 


1 ’ 


NTT 


/ X / Y R (0,0) dxdy 


(4T T ) 2 2 -T -T S 
' x y • x y 


N 


4T T 2 s 
x y • 


R '(°»0) 


(4-11) 
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The output signal-to-noise ratio is then, 


2R (0,0) . . T T R • [p (x,y) ,q (x,y)j 

'WD ’ "x’y fF { Vrr/ T [ T R s (0,0) dxdy} (4 “ 12) 

" x y 


SNR. = hi. T 


There are several important properties of this expression that should 


be noted. First, observe the variation of SNR with T and T . 

WU X y 


1 im SNR, = I im 4T T 


2R (0,0) 


T ,T + 0 
x y 


WD 


T ,T -+0 
x y 


x y N 


= 0 


(A- 13) 


Furthermore, if p(x,y) and q(x,y) are increasing in t, that is, if, 

lim |p(x,y)| = « = lim iq(x,y)j 
x,y «- x,y ® . . 

then , 

1,m ' . SNR « 0 

T ,T WD , ' 

x y 


(4-14) 


(4-15) 


since. 


00 R'[p(x,y) ,q(x,y)] 


I! 


» s (o,o) 


dxdy 


(4-16) 


Therefore, when (4-14) is true, SNR^ has a maximum for some finite inte- 
gration area.. ‘'.'Choice of this integration area will give the maximum 
possible SNRyp. •" 

Secondly, notice the way in which the SNR is affected by the spatial 
distortions. An expression equivalent to (4-12) is, 

T w T w R e [p(x,y) ,q(x,y)j 2 




= SNR,;. { 


1 


X f y s' 


J x J 


W 4T T ■ T T 
x y -T “ I 
f X y 


TTm 5 dxdy> 


(4-17) 


Therefore', the s'i.gnal-to-noi se ratio using the spatially distorted filter 
is .less than .SNR^y us i ng the nondistorted filter by the factor of, 


■ tlTrV-J, V — n°,o 7 d * dy} i' 

x y -T -T s 

7 x y 


T T R [p(x,y) ,q(x,y) ] 


(4-18) 
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This follows from (^— 1 7) since, 

| R s Ep(x,y) ,qjfex,y)] | < R s (0,0) (4-19) 

for all x and y„ Note that this reduction is just the square of the 

normalized average -area under the signal autocorrelation function 

Ip (x,y) ,q (x,y) ] with a spatially distorted scale. 

This result indicates that" the reduction in the output signal-to- 

noise ratio for different values of T and T can be easily estimated with 

x y 

a given distortion model by evaluating the reduction factor given in 
(4-18). Finally note that SNR^ reduces to SNR^ when there are no relative 
spatial distortions, i.e., p(x,y) and q(x,y) equal zero. 

4.3. Nonwhite Noise With No Relative Spatial Distortion 

Analysis of the SNR for nonwhite noise, 

N 

R (t ,t ) = 5(t ,t ) + R (t ,t ) (4-20) 

n x’ ,y 2 x y n x y 

w 

requires one more step than in the . white noise case. This is embodied in 
the formulation of the matched filter and involves Incorporation of a pre- 
whitening operation into the optimum filter. A derivation of this imple- 
mentation of the matched filter hs given in Chapter 3 where a block 
diagram (Fig. 3-4) illustrates its construction. The prewhitening 
operation refers to the filter designed to whiten the input noise. 

Addition of the prewhitening filter converts the problem into that 
where white noise has-been assumed. The particular form for this filter 
depends upon the noise autocorrelation or spectral density function. The 
background image is passed through the whitening filter and then a second 
filter is chosen to maximize the SNR of the prewhitened signal plus noise. 
These two filters in cascade form the matched filter (Fig. 3-3 j . 
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This approach requires an alteration of the SNR formula. It is 
assumed that the received signal plus noise has already been processed by 
the whitening f-ilter so that it ?s necessary to deal only with white noise 
in the choice of -the maximization filter. 


{E[ 7 


1 


4T t t t 
x y -T -T . 

SNR... .= I * 1_ 

NW . T T 


T T 

/ X / Y h ( V x ,Ty-y)s w (x,y) dxdy]}' 


(4-21) 


E Ulyf J / X / Y h(x x -x,T -y)n w (x,y) dxdyJ Z } 


4T T _ T 
x y -T ~T 
1 x y 


Where, 


s w (x,y) = // h w (x-a,y-b) s(a,b) dadb prewhitened signal (4-22) 

n (x,y) - jj b (x-a,y-b)n(a,b) dadb prewhitened noise (4-23) 

w * w 


The prewhitening filter, h (x,y) , is designed such that, 

w 

N 


n w (x,.y)n w (a,b) = 6(x-a,y-b) 


(4-24) 

that is, n (x,y) is white noise. Conversion of (*f-24) to the Fourier 
w 

transform domain yields the following relationship between the power 
transfer function of the whitening filter and the noise spectrum. 

• (4-25) 


N 


[H (u,.v) 
w 


1 


2 S (u , v) 
n 


where, 

H t (u,v) = transfer function of the prewhitening filter 
w 

$ n (u,v) = noise spectrum 

Note that inclusion of the white noise component in the expression for the 

2 

autocorrelation^ function, eq, (4-20), insures that jH (u,v)| < °° for all 

frequencies. This avoids the singular detection problem. 

As before,: for white noise the filter is matched directly to the 
signal, which has been passed through the whitening filter in this case. 
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h( V x ’V y> = s w (lx,y> 

With this filter the expression for the SNR becomes, 

T T „ 


X / Y s^Cx,y)dxdy]}' 

4 X V -T -T w 


SNR NW " 


Ei[ vrr I X / Y s (x,y)n (x,y)dxdy] 2 } 

H 'xy -T "T 
7 x y 


(4-26) 


(4-27) 


which may be. evaluated as follows, 

{E[ vrr f y 4 (x ’ Y)dxdyl}2 15 { vrr / x /J Y s w (x>y) dxdy} 

x y -T -T x y ~T„ -T. 

' V \t‘ ** 


T T 


x y 


x y 


- {^fV/ Tx A R s (0 ’ 0) dxdy}2 
x y -T -T w 
7 x y 


= (0,0) 

w 


wher&, 


(4-28) 


R (0,0) = s 2 ’(x,y) = //// h* (a,b) h (c,e)R (a-c,b-e)dadbdcd'e (4-29) 

C j w — V\/ fi 


is the energy in- the. prewh I tened signal. 

, 1* T o 

E{ [ y T - - T — / X / Y s w (x,y)n w (x,y)dxdy] } 


x y: -T -T 
7 x y 


T T 


1 

x 2 

;r 

< W x T y> 

- T x 

1 

N 

o 

(ltT x T y )2 

z : 

1 

N 

o 

W* 1 / 

1 : 

1 

.. _ v 2 

N 

0 

’ 7 ■ 


w ' .w 


x y 


x y 


x y 


wK <o ’ o) ' 


^ ^T>T> TTnrr»%v 
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The SNR is then. 


2R (0,0) 
s 

SNR. ■ = 4T T — ^ 

NW - x y N 


(4-31) 


Note that SNR^ is proportional to the integration area, 4T^T^. This 
is what is to be expected. Comparing this expression with that obtained 
for the white noise situation, it is found that the two are analogous, 
differing only, by the signal energy used. For the nonwhite noise case 
the prewhitened- signal energy is used as opposed to the original signal 
energy. 


4.4. Nonwhite Noise With Relative Spatial Distortion 
When spatial distortions are present, instead of being able to use 
s(x,y) in forming the receiving filter, it is necessary to use 
s[x+p(x,y) ,y+q(x,y)] , a spatially distorted version of the signal, where 
p(x,y) and q(x,y)- functionally model the distortions. In this case the 
prewhitening filter again is used. However, instead of passing s(x,y) 
through the whitening filter, s [x+p(x,y) ,y+q (x,y) ] is input to the 
whitening filter. This corresponds to the situation faced in practice 
where the time' distorted signal is the only version available. The 
processing filter' then is matched to the whitened distorted signal. 

CO 

h (t -x,t -y) = z(x,y) = jjh (x-a,y-b) s [a+p(a,b) ,b+q (a,b) ]dadb (4-32) 

x . -oo w 

z(x,y) = prewhitened spatially distorted signal 
For convenience in the derivation, the .fol lowing equivalent relation for 

z(x,y) is used. 

00 

z(x,y) = // h (a,b) s[x-a+p(x-a ,y-b) ,y-b+q(x-a,y-b)]dadb 
. w 


(4-33) 
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Substitution of z(x,y) into^he expression for the SNR yields, 

T f 

.2 

J_ JJ z v*,y/s w uc,y; dxdyj . 

y 


SNR 


{E[ ITrT“ / / Y z ^»y)s (x,y) dxdy] } . 

x y -T -T . w 

** 1 — , - 


NWD T T 

£{ [ pp '~ f ~ / X / Y z '(x»y)n (x,y) dxdy] 2 } 

4 x y -T -T w 

1 x y 


(4-34) 


Evaluation of this expression gives, 
T T 


T T 


{E / X / Y z'(x,.y)s w (x,y)dxdy]} 2 » J X / Y zTxTyJs^vyTdxdy} 


x y -T -T 
' x y 


" { ?rV l Y R zs (x ’ y) dxdy} ' 

X ¥ -T "T W 
7 x y 


T T 


x y -T -T 
7 x y 


(4-35) 


wherey 


R (x,y) = //// h (a,b) h (c,e)s Lx-a+p(x-a,y-b) , y- b+q(x-a,y“b) J s ( x- c, y -e) 

ZS W -co w w 

* dadbdcde 

00 

= //// h (a,b)h (c,e)R [a-c-p(x-a,y-b) ,b-e-q,(x-a,y-b) ]dadbdcde 
. w w s 

(4-36) 

is the crosscorrelation function between the prewhitened distorted signal 


and the prewhitened original signal.’ 

T T 

E{ I^ t V f X J Y 2 ( x »y)n (x,y) dxdy] 2 } 

xy -T -T 
7 x y 


1 


T T 


Y If* ff V z (x>y)z(a~,b) n ( x , y ) n ( a , b ) dxdydadb 

(4T T ) -T -T 
x y x y 

1 NTT 


~Y If X ff y z(x,y)z(a,b) <$(x-a,y-b) dxdydadb 


(4T T ) 2 2 -T -T 
x y x y 


i NTT, 

. 1 o r x r v 2, 


— / x / Y z (x,y) dxdy 


(4T T ) 2 2 -T -T 
xy x y 

wV -r l ?rr / * / y R z <x ' y) dxdyl ' 

x y x y -t v -t f 


(4-37) 
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where, 

CO 

Mx,y)=J7//s |x-a+pU-a ,y-b) ,y-b+q U“a,y-bJ J s [x-c+p (x-c ,y-e) ,y-e+q(x-c,y-e) ] 
^ —00 

• h (a,b)h (c,e) dadbdcde 

w w j 

00 . i 

= //// R [a-c-p(x-a,y-b)+p(x-c,y-e) ,b-e-q(x-a,y-*b)+q(x-c,y-e)] 


h (a,b)h (c,e)' dadbdcde 


w 


w 


(4-38) 


is the energy in the prewhitened spatially distorted signal. Note that 

1 

the independence of z(x,y) and n (x,y) in the first step of eq. (4-37) 

w 

follows from the' fact that z(x,y) depends only upon s(x,y) and n (x,y) 

w 

depends only upon n(x,y). Also note that R^(x,y) is not a function of x 
and y for a linear distortion (i.e., p(x,y) and q(x,y) are first order 
polynomials) .' 

The SNR becomes, 

I 1 Jx , T y 1 

zs 


2R (0,0) r 


SNR., »4T T 


w , . 


NWD x y N 


T T R (x, y) 


5TT- r r 

7 X V 


1 r T x Jy R_ (x,y)dxdy 
4T T J ^ 


x y -T -T 

1 x y 


w 


?fV'C U R z (x ’ y)dxdy 

(4-39) 


x y . 


There are several important properties of this expression that should 
be observed. The expression can be rewritten in the following equivalent 
form. 


SNi W= s,)R NU 


T T R (x,y) 
1 r x r y z ' 


I x y- -T “T s 
' x y w 


r T x r T y R zs (x,y)dxdy 

4T T } 1 W 


x.y -T “T 

* X 


4tV I* ! y R x< x >v)<lxdy 
• x Y “ T x T y (if -40) 


It is evident from (4-40) that use of a spatially distorted signal results 

in the reduction of SNR.... relative to the undistorted case by a factor of, 

NW 
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1 r T x rv R (x,y)dxdy 

TiTl V w 


!x y -T -T 
' x y 

v T T 


, T T R (xyy) ‘ * x'v -T -T 

l* 1* *** r 

x y w ^ WT~ / x / Y R ( x >y) dxd y 

x y -T -T 
' x y 


(4-41) 


The inequal ity - follows from examination- of the expressions for R (0,0), 

w 

R (x».y) , and- R (x*y>. 

^ Zix* 

W 


R (x,.y) | < R. (0,0) for all x,„y 
s w 


(4-42) 


T T T T 

1/ X / Y R o (x,.y)dxdy| < |/ X / y R_(x,.y).dxd.y| for all T and T (4-43) 

-T -T zS w • -T -T * x y 

x y x y 


This result shows that the reduction i'n the output signal-to-noise ratio 

with a- given distortion model - for different values of T and T can be 

x y 

estimated b,y evaluating the reduction factor, (‘4—41 ) - it should also be 
noted that SNR^^ reduces to SNR'^ when -no distortions are present (i.e., 
p(x,y,)' and q.(x,y) equa.P zero). 


Next to be* considered is the variation- of SNR..-. with T and T . 

NWD x y. 


1 ' m SNR muo = °‘ 
T T -*■ 0 NWD 
x y 


(4-44) 


Siince- Hm SNR^ - 0 and SNR' < SNR . 


TT + 0 
x y 


Also,, if |p(x,y)| and |q(x,y)| are. increasing in x and y, that is 


1 In* i'P(x»y) I - “= fim |q(x,y)| (4-45) 

x, y ->■ “ x,y -»• ” 


then. 


T . snr nwd - 0 

T | 00 

x y 


(4-46) 


if R (x,y) is not a constant, function of x. and y, then (4-46) fol lows since 
T T T T 

/ X / Y R (x,.y)dxdy and / x / ^ R- (x,,y)dxdy are finite. if R (x,y) is 


T t zs 

-T -T w 

x y 
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a constant with respect to .x and y, then (4-46) is also true since 

T T T T 

/ x / y R (x,y)dxdy < « for all T and T while lim / x / y 

-T “T zs w x y TT •>" -T -T 

x y x y x y 

Mx,y) 

Since SNR... ,'is nonnegative for all T and T , then if (4-45) is 
NWU x y 

true then there must be a maximum of SNR.,,,- for some finite integration 

NWD 

area. These choices of T and T will yield the maximum SNR., 1tr .. Given 

x y NWD 

a model of the distortions, these values of T and T may be found by 

’ x y 

carrying out the required integration in eq. (4-39) 


4.5. Examples of the Loss in the Output Signal-To-Noise Ratio 
Due to, Different Types of Spatial Distortion. 

in this section several examples are given illustrating the loss in 
the output s ignal-to-noi se ratio when a processor designed to register 
spatially congruent imagery is operating in the presence of spatial dis- 
tortions. The loss in the output s ignal- to- noise ratio is examined for 
three types of distortion: the first is a linear scale distortion; the 
second is the situation where the two images are rotated relative to one 
another; and the third is one in which relative distortions representative 
of those observed between mul ti temporal LANDSAT 1 images are considered. 
The first two examples concern general types of distortion. However, the 
last set of examples where the observed distortions between LANDSAT i 
images are considered, provides a means of applying the analytical ex- 
pressions for the loss in the output signal-to-noise ratio to the over- 
laying of images in practice. 

For the examples. presented the noise (temporal change) is assumed 

to be. white so.'that the expressions developed in Section 4.2 may be 

» 

used. Equation (4-12) is used in a slightly rearranged form to evaluate 
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the normalized output s ignaT-to-no i se ratio (denoted by SNR., in this 

N 

section). This equivalent form is. 


SNR WP 

2R s (0,0)/N o 


= hi 


-Vi / 

4T -T -T 


T T R Ep(x,y),q(x,y)] 


2 


R s (o,0) 


dxdy 


where, 


(4-47) 



and the signal-to-.noise. ratio has been normalized by 2R s (0,0),/N o . 

In order to evaluate (4-47), it is necessary that a model of the 
signal (image) autocorrelation function be chosen. • For these examples 
an exponential .autocorrelation function was used. 

R s (x,y) = R s ('0,0) exp {- -1—1 L^i*} (4-48) 

where. 


r = characteristic length of the autocorrelation function 
Substitution of this expression info (4-47) yields. 


£NR WP 
2R (0,0)/N 


= 4T 


, T T 

— Y 'l / exp [- 
4T -T -T 


l'p(-x.y) 1 _ lq(x,y) 


1 2 


-] dxdy 


(4-49) 


s o 

Given a model of the distortions, p(x,y) and q(x,y), (4-49) can be 


evaluated' to determine the SNR.., 

N 


4,5.1. Linear Scale Distortion., 

The first type of spatial distortion examined Is that of a linear 
scale distortion.. In .matrix notation this is represented as. 


X'* 


1 + c 

o T 


X 

y' 


0 1 + c 


y 

— — 


L- 

j 


— _ 


(4-50) 


where.,. 
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(x,y) = reference Image coordinate system 

(x* ,y') = coordinate system of' image to be overlayed on reference 
image 


c = scale factor distortion 


Si nee. 


x 1 = x + p(x,y) = x + cx 
y' = y + q(x,y) = y + cy 

from the definition of p(x,y) and q(x,y) in Section h.2, then, 
p(x,y) = cx 
,q(x,y) = cy 


(4-50 

(4-52) 


(4-53) 

(4-54) 


Substitution of these expressions for p(x,y) and q(x,y) into (4-49) yields 
the equation for the SNR^ in the -presence of a linear scale distortion. 


.-'4T 2 ~~ j I exp [- M|x| - i^J-|y|]dxdy 
• -t - T r r 


(4-55) 


Note that the SNR^ is dependent upon two parameters: 4T , the integration 
[cl ‘ 

area; and J — k the ratio of the scale distortion factor to the character- 
r . 

istic length of- the signal' (image) autocorrelation function. 

Evaluation of (4-55) was carried out by a numerical integration 
method. Simpson's rule for approximating the integral ofa function by 
the piecewise integrals of quadratic polynomials was used [40], This 
procedure proved both straightforward- and accurate. With a division of 
the interval, 2T, into 100 increments it. was found that a tolerance of 
about 0.005% was observed. Also, use of this method of integration was 
not time consuming and was easy to implement since it only involved cal- 
culation-of a weighted sum of the integrand at each of the increment end- 
* points. The general formula for integration via Simpson's rule is shown 
below for a doub.le integral. The integration is carried out by first 
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integrating with respect to x and then .with respect to y, i.e., 

T T T 

/ / f(x,y) dxdy = fj g(y) dy 

-T -T -T 


where. 


g(y) = / f(x,y) dx 
-T 


i odd 


where, 


(4-57) 


Using Simpson's rule the approximate integral is, 

T T N-l N-2 

/ / f(x,y)dxdy ~ (4 Z g(-T+ih)+2 Z g(-T+ib)+g (~T) + g(T)} (4-58) 

-T -T * i=l i=2 


g(y) = “ (4 Z f ( -T+ j h , y ) +2 Z f(-T+Jh,y) + f (~T,y)+f (T,y) } 
i j=l j=2 

j odd j even 


(4-59) 


N = number of divisions of the interval, 2T; must be an even 
number 

h = 2T/N, the increment length of each division 
For this computation the same number of increments are used for both 
variables of integration. 

Figures 4-1 through 4-4 illustrate the -relationship between the 
SNR^j, the integration area, and the linear scale distortion. Figures 
4-la and 4Hb show the square root of the -SNR^ (denoted -by /SNR^) 
for different values of scale distortion as a function of 2T, the square 
root of the integration area. The reason for thi s .particular choice for 
ordinate and abscissa is that the /SNR^ in the absence of spatial dis- 
tortions is l inear in 2T with a s.lope of one. This is evidenced by 
letting p(x,y) and q(x,y) equal zero in (4-47). In this way it is 
possible to plot the results on a linear scale. 
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Two_ separate figures are presented so as to illustrate the distor- 
tion factor effect over a wide range of values, in Figure 4-la the 
/SNR., versus 2T curves are given for equal to 0, 0.01, 0.05 and 0.1. 
In Figure 4-lb M equals 0, 0.001, 0.0025 and 0.005. There is consider- 
able convenience in being able to represent the scale factor, c, and the 
characteristic length, r, in a single term. In this way each curve is 
representative of a family of values of c and r. For example, a value of 
= 0.01 can correspond to a value of 0.01 for |c| and 1.0 for r as 
well as the combination of 0.02 for |c[ and 2.0 for r. 

As predicted by the derivation in Section 4.2, in the presence of a 
scale distortion there is a finite integration area which yields a 
maximum output signal-to-noise ratio. This is illustrated by the occur- 
rence of a peak in the curves for nonzero c. This result indicates that 
there is an optimum integration area size to use in the presence of a 
linear scale distortion when the registration processor is designed for 
spatially congruent imagery. Given the scale distortion, the optimum 

choice of integration area size is that which yields the maximum SNR^. 

I c I 

For example, with a distortion of = 0.05, the maximum SNR^ is 
found for 2T a 50. 

Several other observations also may be made from Figures 4-la and 
4-lb. Again as predicted by (4-47) the /SNR^ is linear in 2T with a 
slope of one for no scale distortion. Also note that a larger distortion 


N‘ 


factor requires a smaller Integration area to yield the maximum SNR 

For example, with = 0.01 the 2T for a maximum SNR N is 252, whereas 

for = 0.05, the 2T for a maximum SNR„ is 50. 
r N 

Figure. 4-2 follows from the observations made in'the first figure. 
Figures 4.-la and 4-lb illustrate that given the scale distortion factor. 



2R (0,0) /N 

s o •• 1 

■ 10 1 


] It 

Figure 4-2, (integration area) yielding the maximum normalized output 
• signal -to-no?se ratio vs. linear scale distortion. 


maximum • ^ ^2 


2R (0,0)/N 4 

S O • 



Figure 4~3. Maximum of (normalized output si gnal-to-noise ratio) 
vs-, linear scale distortion. 
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Table 4-1 . 


2T yielding the maximum "SNR^ 

y^SNR.T for different values of 
N 


and the maximum 

M. 

r 


id 

r 

2T for 

max SNR.. 

N 

max A NR* 

N 

0.001 

2513 

• 814.5 • 

0.0025 

1005 

325.8' • 

0.005 

502 

162.9 

0.01 

252 

81.5 

0.05 

50 

16.3 

0.1 

26 

8.1 
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there is an optimum integration area size yielding the maximum SNR^. 

This suggests that given the scale distortion, it is possible to find 
that integration, area giving the maximum output s i gna 1 - to-noi se ratio. 
Table 4-1 lists a sample of values of the scale distortion and the cor- 
responding 2T yielding the maximum normalized SNR^. These values were 
used to generate ,Fi gure 4-2 which Is a plot of the value of 2T yielding 

the maximum SNR., versus the linear scale distortion. Note that the 
N’ 

function is linear on a log-log plot. This indicates that the relation- 


ship is of the following form, 

- -t$ 

‘ ' 2T ** cc -Lsi- 


where a and 6 may be determined from the curve or Table 4-1. The values 
found for a and 3 are, 

ct ~ 2.52 
3 = -1 


Therefore, 


2T for max SNR,, 2 2.52 


This result suggests that given the linear distortion factor, the area 


yielding the maximum output signal-to-noi se ratio can be easily computed 
from (4-61) 

Figure 4-3 is a plot of the maximum /SNR^ versus the linear scale 
distortion. The values in Table 4-1 were used to generate this curve. 
This graph displays the maximum attainable /SNR^ in the presence of a 
given scale distortion. As in Figure 4-2, the relationship is linear in 


a log-log plot which indicates that the maximum SNR^ varies with the 
scale distortion in the following manner. 
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v^NRT = a 

N 


r Mf 

r 


( 4 - 62 ) 


or equivalently. 


SNR n « (a 


Ml 

r 


13 


} 


(4-63) 


where a and g may be found directly from the curve or Table 4-1. 

a 2 0.81.45 
6 - “I 

Substi tutioii of these values into (4-63) yields, 

r rl 

M 


SNR., -S {0.8145 
N 


} 


(4-64) 


Therefore, the maximum possible SNR^ for a given scale distortion may be 
found directly from (4-64). This provides a straightforward way of esti- 
mating the best possible performance in the presence of a given scale 
distortion. 

A third means of analyzing the relationship between the SNR^ and 
scale distortion is provided in Figure 4-4. This figure is a graph of 
the ^SNR^ for different integration area sizes versus the linear scale 
distortion. Figure -4-4a illustrates the relationship for 2T equal to 30, 
50 and 100, while Figure 4-4b shows the corresponding results for 2T equal 
to 250, 500 and 1000. For design .purposes this may be utilized in the 
following fashion. If the .integration area is given, then the maximum 
allowable distortion can be determined once a loss criterion in terms of 
the reduction in the SNR^ due to spatial distortion is decided upon. For 
example, if an integration area size with '2T = 100 is chosen with a toler- 
able loss of 19% in the 'SNR^, then the maximum allowable distortion is 


M. 


z 0.002.. This is found directly from Figure 4-4a. Since .a loss of 




„ 1/2 
Figure 4-4. (Normalized output si gnal-to-nolse ratio) for 

different integration area sizes vs. linear scale 

distortion. 
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19? In the SNR^ is- equivalent? to. ( 0 . 81 ) SNR^, this corresponds to 0.9/SNR^. 
With 2T equal to 10CX, the /s'NFL, has a maximum of 100' (i.e., for c = 0), 

so that 0.9 /S' NR,. =■ gro". Determination of the scale distortion for a value 

N id 

of /S'NR^ = 90 on the 2T = 100 curve yields z 0.002. 

One’ other property of the curves may be observed. Note that the 

curves cross at specified values of linear scale distortion. For example, 

beyond a certain’ value of the AnR., for 21'= 100 is less than that 

r N 

for 2T =-50. This, fol lows.- from the results obtained in Figure 4-1, 

where it is shown that for a given distortion there is a value of 2T 

yielding the maximum SNR..,, all other values: of 2Tyielding a lesser SNR,,. 

N N 

Figures 4—1 through 4-4’ corroborate the analytical results derived 
in section 4.2. By utilizing them as outl ined, above, they provide a 
means- of choosing the optimum integration area' s-.ize in the presence of 
linear scale distortion; 


4.9.2.. Rotation Distort! on. 

The" second general type of distortion examined is that where the 
two images are rotated relative to one another. This spatial relation- 
ship is represented in' matrix form' by. 



T + (cos'0-1) sinfi 
sine 1 + {cos'0-1) 


x 

y 


(4-65) 


where, 

(x,y,) = reference image, coordinate- system 

(x',.y ') = coordinate system of image to be registered with' the 
reference- image 

9 = angle of rotation between the images- 

Si nee, 


SKBSKSS’’ 
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x* = x + p(x,y) 

y' = y + q(*,y) 

from the derivation in section 4.2, therefore, 

p(x,y) " x (cos 9-1) - y sin9 (4-66) 

q(x,y) - x sin0 + y(cos8-l) , (4-67) 

Substitution of p(x,y) and q(x,y) into (4-49) yields the expression for 

l 

the SNR^ in thepresence of an angular distortion. 


(4-68) 

2 

Note that the SNR^ is dependent upon: 4T , the integration area; 6, the 
angle of rotation; and r, the characteristic length of the image auto- 
correlation function. 

As in sect ion. 4.5. 1 the integration of (4-68) was carried out by 
using Simpson's r,ule approximation to the integral. 'Refer to 4.5.1 for 
a description of , how this is implemented. 

The relationships between the SNR^, the integration area and angular 
distortion are T 1 1 us t rated in Figures 4-5 through 4-8- , The contention 
made in section 4.2 that there is an integration area yielding a maximum 
output signal-to-noise ratio for a given distortion .is borne out in 
Figures 4-5 and 4r6. Both figures show the /SNR^ as a function of 2T. 
Figures 4**5a and 4-5b illustrate this relationship for several different 
angular rotations where the characteristic length, r, is equal to 2. 
Figure 4-6 illustrates the same relationship with a characteristic 
length of r = 5*' In both cases the results reduce to the expected linear 


' T T 

— 2 / / exp 
4T -T -T 


|x(cos6-l)-ysin6 


xsin6+y(cos0-l ' 




Figure b~5. 


3/2 


'(Normal i:zed ..output -signal -to- noise -ratio) 
for different rotation angles vs. (integration 





1000 5000. 

21 

1 /2 

Figure 4-6. (Normalized output s i gnal-to-noi se ratio) 

for different rotation angles vs. (integration 


rea, 
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relationship between the v/SNR^ and 2T for no angular distortion, i.e., 
9=0. As was observed for a> linear scale distortion, there is an inte- 
gration area' yielding a maximum SNR^ in the presence of an angular dis- 
tortion, as is evidenced by the peak in each of the curves with nonzero 
0. For example,, with a rotation, angle equal, to 1°;. a value of 2T z 290 
achieves the maximum SNR.,. 

For design, pruposes in choosing an optimum integration asirea size, 

the relationship between the. value of 2T yielding the maximum SNR^ versus 

the angular distortion is given in Figure -4-7. Using this figure, it is 

a straightforward procedure to choose the integration area size which 

allows the maximum SNR^ given the value of the angular distortion. This 

relationship is illustrated' for two values of the characteristic length, 

r = 2 and r = 5* Note that each is linear on a log-log plot for the 

values of 6 chosen (0 < 9 < 10°). This indicates that the relationship 

is of the following form for this, range of 0. 

g 

2T for max SNR^ z a^0 r 0 < 0 < 10“ (4-69) 

where and g^_ depend upon r. The, data samples used to generate the 

curves in Figure 4-7 are listed' in Table 4-2. Using, these values, 

and g become, 
r 

a 2 * 288 0 2 = -1 

a c t 720 g r = -1 

Therefore, for r = 2, 

2T for max SNR^ z 288 0~^ (4-70) 

0 < 9 < 10 ° 

and for r = 5,. 

2T for max SNR^ « 720 0~ l ; 0 < 9 < 10° 


(4-71) 
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Table 4-2. 2T yielding the maximum SNR for 
different rotation angles,- 


r = 2 


r = 5 


e° 

2T for max SNR:. 

N 

2T for max 

0/25 

1152 

' 2879 

.0.5 

576 

1440 

0.75 

384 

960 

1.0 

288 . ■ 

720 

2.0 

144 

360 

3.0 

96 

240 

5.0 

58 

- 144 

10.0 

28 

72 


N 


Table 4-3. Maximum v^NR~ for different 

N 

rotation angles. ' ; 


e° 

r = 2 

max /SNR,. 

N 

• r = 5 
max ,/SNR| 

0.25 

373.3 

933.4 

0.5 

186.7 

466.7 . 

0.75 

124.4 

311.1 

1.0 

93.3 

. 233.3 

2.0 

46.7 

116.7 

3.0 

31-1 

77.8 

5.0 

18.7 

46.7 

10.0 

9.3 

- 23.4 
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Extrapolation of these last expressions results in a general approximate 
formula for determining 2T corresponding to the largest SNR^ in the 
presence of angular distortion for small values of 0. 

2T for. max SNR., s 144 r(e" ] ) (4- 72) 

0 < 0 < 10 ° 

This was shown to be true by evaluating (4-72) for different values of r 
and observing whether the indicated value coincided with that observed. 

In all cases they agreed. This result indicates that given the angular 
distortion and characteristic length of the image autocorrelation func- 
tion, the area 5 yielding the maximum SNR^ can be computed from (4-72) 
for small values of 0. 

Figure 4-8 ls» a plot of the maximum /SNR"^ versus the angular distor- 
tion for two values of the characteristic length. These curves were 
generated from the sample values listed in Table 4-3. This graph displays 
the maximum attainable /SHR^ in the presence of a given angular distor- 
tion. As in Figure 4-7 the relationship is linear in a log-log plot for 


both values of r over the range of 0 used. This indicates that the 

maximum /SNR., varies with the rotation angle in the following manner. 
N 


/SNfC.s a 0 
. n r 


0 <8 < 10 c 


or equivalently. 



0 < 0 < 10 ° 


(4-74) 


where a r and 8^ depend upon the characteristic length. The values of a r 
and $ r maybe found from -Table 4-3 or Figure 4-8. 
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a 2 » 93.3 3 2 “ - 1 

« 5 “ 233.3 8 5 ? "1 

Therefore, for r « 2, 

2 

max SNR n s [93.3 0 3 (4-75) 

0 < 0 < 10 ° 
and for r = 5, 

-1 2 

max SNR m - [233.3 0 3 (4-76) 

o < e < io° 

These last two expressions may be extrapolated to give the following 

general expression for the relationship between the maximum SNR., and the 
• N 

angle of rotation for small angular distortions. . 

, , 2 

max SNR^ s [46.65 r (0 •)] (4-77) 

o < e < io° 

This expression was corroborated by evaluating (4-77) for different 


values of r and testing whether the resulting value for the SNR., was 

N 

indeed a maximum. In all cases it was. 


Figure 4-9 provides a series of curves representing the v'SNR^ for a 
given integration area size as a function of the angular distortion. 
Figure 4-9a illustrates this relationship for a characteristic length of 


r =* 2, while a value of r - 5 was chosen for Figure 4~9b. Curves for 
2T = 50, 100, 200 and 400-are displayed for both figures. This represen- 
tation of the functional * relationship between the SNR^ and angular dis- 
tortion may be used for design of the registration processor in the 
following manner. Given the integration area and percentage loss in the 
SNR^ that is acceptable, the maximum allowable rotation may be found from 
the curve corresponding to the appropriate integration area. For example, 
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if the characteristic length is 5 and a loss factor of 0 o 8l for the SNR^ 
is acceptable when operating wi-th 2T - 200, then the maximum .a 1 lowable 
rotation is 0.15°. This is found directly from Figure 4-9b» A loss 
factor of 0 . 8 1 for the SNR., corresponds to a loss factor of 0.9 for 
7SNR N . For 2T — 200, the maximum /SNR^ without any angular distortion 
is.200, so that a. reduction of 0.9(200) equals 180. The maximum allow- 
able rotation Is that angle corresponding to a /s’NR of 180, which is 
0.15°. 

This concludes the .general examples for examining the effect of 
spati.al distortions -on the output signal-to-noise ratio. The illustra- 
tions presented in Figures 4-1 through 4-9 verified the .derived results 
ob.tained In section 4.2. It was "found that in the presence of a given 
linear scale or rotational distortion, there is a unique Integration 
area size which yields a maximum .output signal-to-noi.se ratio. Figures 
4-1, 4-5 and 4-6 illustrate this while figures 4-2, 4-3, 4-7 and 4-8 
.present a -straightforward way of determining the optimum integration area 
size and maximum SNR^ achievable. The next section proceeds iti a similar 
fashion -with spatial distortions modeling those observed for .LANDSAT I 
images. In .this way the method for applying the results of section 4.2 
rs illustrated for a practical Image registration model. 

4.5.3* Distortion Model for Temporal ly .Differ i ng LANDSAT 1 images. 

In this section the distortion model employed in the LARS registra- 
tion system [ 1,5-3 for overlaying LANDSAT I images is used to evaluate 
the expression for the SNR„ in the presence of spatial distortions. For 
mul ti' temporal LANDSAT I images a biquadratic polynomial of the following 
form Is used as. .the .distortion model. 
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X 1 


,+c n 

C 12 c 13 c l4 c 15 

Y* 

- _ 


_ C 21 

l+c 22 c 23 c 24 c 23 


" 2 

x 

2 

y 

xy 

where, 

(x,y) = reference image coordinate system 


(4-78) 


(x',y') = coordinate system of image to be regis'tere'd with the 
reference image - - 


and, . 

c.j = distortion coefficients; i = 1,2,; j = 

The values of the coefficients are determined by a least' squares procedure. 
The approach followed in the LARS registration system [ 1, 5 ] is to over- 
lay a sample of subimages from each full image assuming spatial congruence. 
This registration of each of the subimages is accomplished by a simple 
translation because of the assumption that no relative spatial distor- 
tions exist between the subimages. However, because- the full images are 
relatively distorted, all of the translations for the stibimage overlays 
are not the same. A least square estimate using a biquadratic polynomial 
(eq. (4-78)) then 'is used to find that spatial transformation between the 
full images which best fits all of the subimage translations simultaneous- 
ly. 

V/ith this model of the spatial distortions, the functions p(x,y) and 
q(x,y) used in the expression for the output signal-to-noise ratio become, 

2 2 

p(x,y) = c ]5 x + c ]2 y + c^x + c^y + c^xy (4-79) 

q(x,y) = c 21 x + c 22 y + c^x 2 + c^y 2 + c^xy 


(4-80) 
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Substitution of the'se expressions for p(x,y) and q(x,y) into equation 

(4-49) allows' for the evaluation of the SNR’^. In this case, since the 

speci f ic‘ di stort ion" i s' given-,- the S'NR^ depends upon two parameters: the 

2 

integration area, 4T ; and the characteristic length of’the'image auto- 
correlation function. Again evaluation of (4-49) was performed by using 
S impson 1 s ' rul e approximation to the integral.- 

Two sets- of distortion- parameters- were used in this evaluation of 
the SNR^ for LANDS'AT I imagery. The • coef f i c ient sets chosen were those 
used in the operational registration of images. In this way the method 
of using the analytical results of section 4.2 for. the situation en- 
countered in pra'ctice is exemplified. The LANDSAT I imagery registered 
and' corresponding distortion coefficients for the two overlays are listed 
in Table 4-4. 

The relationship between the vSNR^ and the integration area for 

different' values of the characteristic' length is displayed in Figures 

4—1 C^a- and 4— 10b. Fig'ure 4-l’0a contains' the results for the first set of 

coefficients .and Figure 4- 1 Ob for the second set. Note that for each 

curve in both figures there is a value of 2T yielding a maximum SNR... 

N 

This indicates that there is an optimum integration area where a maximum 
SNR^ is realized in the pre'sence of the distortion models chosen. 

in each figure the series of curves illustrates the dependence of 
the integration area size yielding "the maximum SNR^ on the characteristic 
length of the image' autocofre Vat Ion function. For example, when the 
first distortion model is used (Fig. 4-10a), a- value of 2T = 70' will 
give the maximum SNR^ for r = 2, Whereas for r = 5 a value of 2T = 1 80 
must be cho'sen. Therefore, choice of an optimum integration area size is 
determined by the value of r. 
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Table 4-4. LANDSAT 1 images registered and the 

corresponding distortion coefficients. 

Set 1 



LARS Run # 

Lines 

Col umns 

Reference run 

72053602 

(1350,2200) (1350,2250) 

Overlay run 

73070100 

(1450,2340) (1650,2700) 

Distortion coefficients 


c 1 = -0.05694005 

c 12 = 0.01516939 
c ]3 » 0.00001908 

c ^ = -0.00000280 

c = -0.00000445 

1 5 

Set 2 

C 21 = 
C 22 = 
C 23 = 
C 24 = 
C 25 = 

-0.02044661 

-0.08017300 

-0.00000123 

0.00001501 

0.00001334 

f 


LARS Run # 

Lines 

Columns 

Reference run 

72053602 

(1350,2200) (1350,2250) 

Overlay run 

75009000 

(1490,2340) (1425,2325) 

Distortion coefficients 


c n = -0.08289302 

c = -0.02288279 

c = 0.00000824 

c ^ » -0.00001332 

c_ = 0.00003550 
15 


C 21 
C 22 = 
°23 = 
• C 24 = 
°25 = 

0.01898512 

-0.11859888 

0.00000147 

0.00003922 

-0.00001333 
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a. Fi rst .spatial distortion model (set 1). 



0 400 

200 600 

2T 


b. Second spatial distortion model (set 2),. ' 

1 /2 

Figure A-10. (Normalized output slgnal-to-noise ratio) 
for different values of r vs. ‘(Integration 
1 /? 

area) (LANDSAT I images) 
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This dependence upon r required that an estimate of the image auto- 
correlation function be made so that a value of r could be determined. 

This was carried out by first picking several test sites from the images 
to be registered. - The subimage size chosen for each of these test sites 
was 111 lines by 111 columns. 

The next step in estimating the image autocorrelation function re- 
quired a preprocessing operation on the images. Since the examples in 
this section are applications of the analysis in section 4.2, it is 
necessary that the noise (temporal changes) be white. in Chapter 5 it 
is experimentally observed that the noise is nonwhite with an exponential 
autocorrelation function. In this situation it is necessary that the 
images first be passed through a filter designed to whiten the noise and 
then these pr.eprocessed images be registered (refer to Figure 3"4). 

With an exponential autocorrelation function for the temporal changes it 
is shown in the; example in Chapter 3 that a derivative type operator must 
be applied in' the preprocessing stage. In compliance with this analytical 
result a gradient operator was applied to each of the images, where, 


Gradient X.. .| = {(X . +1 - X. J _ 1 ) + (X. + , . - X.^ ; ) 2 } 


1/2 


• >J 


1+1 J M,j' 


(4-81) 


. X. . = image sample value at coordinate (i,j) 

• i,J 


The autocorrelation function estimate then was made from the gradient 
images. The following expression was used to estimate the autocorrelation 
function. 


. N-£ N-k 

R *,k jf, j (X i+*,j+ k “ X)(X i,j “ X) 


a - o, i , .. . , m 

k = 0 , 1 , o . . , M 


(4-82) 
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where, 

X = mean of the image 

X. = image sample value at coordinate -( i , j ) 

•'* J 

2 

N » number of data points In the image' 

M = maximum shift for the autocorrelation function estimate 

Figures 4-11 .through 4-15 contain examples of the resulting auto- 
correlation ^surfaces and contours of these surfaces. In Figures 4- 11a 
through 4-15a each of the autocorrelation surfaces are displayed, where 
each number denotes a value of the surface at the corresponding shift 
position and the (0,0) lag position is in the center of the d.i.sp-1 ayed 
surface. The scale for each of the surfaces has been normalized by the 
factor 100/R . 

'U f 

Although Figures 4— 11a through 4-1 5a .present the complete surfaces, 
the general shape of the surfaces is better illustrated by the .contour 

plots .in Figures 4— 1 lb through 4-15b. in these figures, contours at 

- 1 -2 

levels * R o;o e .and R 0 ’e are displayed. In this way It is a 
straightforward procedure to determine whether the surfaces are of 
exponential form, and if so, what is the characteristic length,, r. If 
the contours are equally spaced, then the surface is exponential and 
the 'characteristic length is the rdi stance between the contours. From 
these figures. .it is seen that an exponential model for the autocorrelation 
surface is -a reasonable model with a characteristic length ranging 
between .1 and 3. 

Using this range of r and the curves .presented - i n Figures '4-10a and 
-■4— 10b, the range for 2T yielding the maximum SNR^ is, 

40 < 2T < 110 for distortion coefficient set 1 
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AUTOCORRtLATlLlN 1 UNCTION ESTIMATE 
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of the gradient of the image. 
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Figure k-] 2b. 

Autocorrelation surface contour for the 




magnitude of the gradient of the image. 
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Figure 4~13a. Autocorrelation surface for the magnitude 
of the gradient of the image. 
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figure 4-1 3b. Autocorrelation .surface contour for the 



magnitude of the gradient of the image. 
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figure 4-l5b.. Autocorrelation .surface contour 'for the 
magnitude of the gradient' of the image. 
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and, 

30 < 2T. < 70 for distortion coefficient set' 2 
Therefore, it is possible to choose an optimum integration area size for 
the registration processor based upon the distortion model and image 
autocorrelation function characteristics observed for actual satellite 
images. 


4.6. Conclusion 

In the situation where relative spatial distortions exist between 
images to be registered, expressions have been derived for estimating 
the loss in the output s ignal -to-noi se ratio due to these spatial distor- 
tions. These results are in terms of a reduction factor (eqs. (4-18) and 
(4-41)) applied -to the SNR had the spatial distortions not been present. 
For distortions that are increasing with image size (eqs. (4-14) and 
(4-45)) there is a finite integration area th^t yields the maximum SNR. 
Determination of this integration area may be found by evaluating ex- 
pressions (4-12). and (4-39) with an appropriate model of the distortions. 
This evaluation is a straightforward procedure which may be accomplished 
by numerical integration methods as is shown in section 4.5. This is 
performed for bothgeneral linear distortions such as a scale change or 
rotation, plus two distortion models for LANDSAT I images. These latter 
examples illustrate the direct application of this analysis to practical 
image registration. 
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CHAPTER 5 

TEMPORAL CHANGE PROPERTIES 

tn one important application of image registration, one would like 
to match spatially as close as possible two temporally differing images 
of the same scene- so that they may be compared on a point by point basis. 
Assuming that there are no relative spatial distortions between the images, 
the registration process reduces to finding the relative translation between 
the images- However, since the images have been taken at different times, 
one can expect that changes in the scene have occurred, so that the two 
images will vary in the intensity levels as wel.l as their relative trans- 
lation. This variation in intensity levels contributes significantly to 
the uncerta i nt'y -in -f i ndi ng the relative translation. 

In the development of a registration processor, these changes have 
been modeled as additive noise. One image is assumed to be the signal 
and the second image the signal plus noise (temporal changes). 

Several approaches to the image registration problem have utilized 
statistical parameter estimation theory, where the parameters to be esti- 
mated are the translations along the respective coordinate axes (Chapter 
2) . A central part of this type of analysis is that one be able to 
characterize the noise properties, where the properties in question are 
determined by the particular approach that is taken. This is illustrated 
in Chapter 2 where an expression for the variance of the error of the 
registration processor is derived. Two lines of approach are presented, 
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both of whose val i ; di,ty. depepds upon certain assumptions. The first 
method requires knowledge of -£he probability density function of the 
noise and the .second assumes that the autocorrelation function or spectral 
densi ty of the. no.ise is known. 

These requirements prompted the development of a model of the temporal 
change characteristics. The particular properties of concern -are those 
that have been .encountered in the analysis and development of a registra- 
tion processor. These are the probability density function and the auto- 
correlation or spectral density function of the noise. 

‘ 5.1. Probability Density Function of the Temporal. Changes 

The-first .analysis is that of the probability density function of 
the noise. Since the. noise is defined as being the additive change that 
has occurred between registered • images , this investigation necessitated 
the registration of a. series of images and .then a subtraction of the 
image pairs to generate the >di f ference image, .or additive noise, for each 
time pair. -Test sites for thi s study .were chosen from LANDSAT imagery 
over .Kansas, Montana, Missouri, and In.diana, and are tabulated-in Table 
5-1. These. particular test .sites were picked because mul ti temporal 
imagery. that had been previously registered [35] was readily available. 

The probability density function of the noise for each test site 
was estimated by generating a histogram of each of the corresponding 
difference images .and then normalizing the ‘his.togramby dividing by the 
total number of points in the difference image to obtain an approximation 
to the probability density function. 
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Table 5~1.. Test sites for temporal change investigation. 
, a, Kansas 


LARS 
Run § 

73046000 

73064000 

74024100 

74024200 

Date Data 
Taken 

7/6/73 

8/29/73 ' 

5/26/74 

7/1/74 

Area 

f 

Line Column 
Center Center 

Line Column 
Center Center 

Line Column 
Center Center 

Line Column 
Center .Center 



T 1 1 445“ 

116 389 

294 393 

218 336 

1 


115 389“ 

293 392 

218 336 




294 392* 

218 336 



116 

.u 

570“ 

121 

514 

300 

518 

223 

462 

2 



121 

514* 

299 

518 

223 

463 






300 

.L 

518" 

224 

462 


21 1 

435" 

216 

380 

395 

383 

318 

327 



216 

380* 

395 

384 

317 

CM 





396 

384" 

319 

327 


111 1495“ 

120 1440 

297 

1447 

219 

1398 


120 l44o" 

298 

1447 

219 

1398 

- 

— 

298 

1447* 

219 

1398 


352 210* 

358 

154 

537 

157 

■459 

98 


358 

154 

537 

157 

459 

98 




538 

157* 

459 

98 



100 

?V 

170 

104 

113 

282 

116 

206 

57 

6 



105 

j - 

114" 

283 

116 

208 

58 






284 

A 

118" 

208 

.59 



loo 

* 

• 310 

104 

254 

282 

257' 

207 

199 

7 



104 

. * 
254 

284 

257 

207 

199 






284 

258" 

208 

200 
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•Table 5~la, cont. 


LARS 

Run # 730.46,000 73064000 74024100 74024200 


Date Data 

Taken 7/6/73 8/29/73 5/26/74 7/1/74 


Area 

# 

Line 

Center 

Col umn 
Center 

Line Column 
•Center Renter 

Line 

Center 

Col umn 
Center 

Line 

Center 

Col umn 
Center 



250 

■ * 

170 

255 ■ 

114 

434 

117'. 

356 

60 

8 



255 

BH 




57 







BE! 


59 



250 

ft 

310 

255 

255 

434 

258 

357 

200 

9 



255 

254* 

434 

257 

356 

199 






434 

258" 

mm 

200 



400 

360 ' 

407 

304 

586 

308 

50 7 

250 

10 



405 

304" 

584 

307 

505 

250 






584 

JL 

308" 

505 

250 



400 

JL 

510" 

407 

455 

587 

459- 

507 

403 

1 1 



405 

454* 

584 

458 

505 

402 






584 

4r- 

CO 

504 

401 


Reference location for corresponding line and column 
centers which are tabulated to the right. 
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5- lb. Hill County, Montana 


LARS Run Number 73124700 


LARS Channel 

# is 1-4 21-24 17-20 9-12 5-8 


Date Data - ■ ' , ' ■ 

• Taken 5/5/73 5/23/73 1 - • ~ 6/10/73 7/16/73 8/3/73 


Area Line 

# Center 

Col umn 
Center 

Line 

Center 

Col umn 
Center 

Line 

Center 

Column 

Center 

Li ne 
Center 

Column 

Center 

Li ne 
Center 

Col umn 
Center 


no 

410 ' 

no 

410 

1 10 

409 

109 

410 

no 

410 



110 

4 10” 

1 10 

409 

109 

410 

no 

410 

i 




1 10 

, * 
410 

109 

41 1 

1.10 

412 







1 10 

410* 

in 

410 

170 

1 30* 

170 

129 

170 

129 

170 

128 

170 

128 



170 

'k 

130 

170 

130 

170 

129 

170 

129 ■ 

2 




170 

k 

130 

170 

130 

170 

■ 129 







170 

k 

130 

170 

130 


415 

JU 

150“ 

415 

150 

415 

149 ‘ 

414 

149' 

415 

148 



415 

• 1 50 “ 

415 

149 

415 

150 

415 

148 

3 




415 

ju 

150“ 

415 

150 

415 

149 







415 

..u 

150" 

415 

149 


Reference location for corresponding line and column 
centers which are tabulated to the right. 
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55“lc. Missouri 
LARS -Run Number 72033804 


LARS 

# 

Channel 
i s 

1-4 

5~8 

9-12 

Date 

Taken 

Data 

9/13/72 

8/26/72 

10/1/72 

Area 

# 


Line Column 
Center Center 

Line Column 
Center Center 

Line Column 
Center Center 

- * * - ' 

1 


A 

'200 200 " 

201 200 

201 200 

1 



200 200 “ 

199- 201 


0 

200 

400" 


399 

201 

400 

2L 



200 

Ju 

400“ . 

199 

401 


200 

A 

600 “ 

202 

599 

201 

599 



200 

600 " 

199 

601 


200 

A 

800 " 

202 

799 

200 

800 



200 

800* 

199 . 

800 


200 

A 

lOOO" 

201 

1000 

200 

1000 



•200 

1000 5C 

200 

1000 


400 

200 “ 

401 

199 

401 

200 



400 

,200“ 

400 

200 



, 4 oo“ 

401 

399 

400 . 

400 



400 

400 “ 

400 

401 


8 

400 

x 

*600 

401 

600 

400 

600 



400 

A 

600“ 

4 oo 

601 


9 

400 

A 

-800“ 

401 

8oo 

400 , 

800 



o 

o 

-zr 

8Q0“ 

399 

800 


1 A 

400 

A 

1000“ 

401 , 

999 

400 . 

999 

1-U 



• 400 

1000“ 

399 . 

1000 


1 1 

600 

A 

200“ 

601 


600 

600 





> 599 . 

199 
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Table 5“lc, cont. 


LARS Channel 
# i s 

1-4 

5-8 

9-12 

Date Data 
Taken 

.. 9/13/72 

8/26/72 

10/1/72 

Area 

§ 

Line Column 
Center Center 

Line Column 
Center Center 

Line Column 
Center Center 


1 ? 

6oo ■ 4oo'' 

- 601 • 400 

" 600 400 



600' 400" 

599 400 


13 

. 600 

600* 

601 

599' 

600 

601 



600 

600* 

599 

601 


14 

600 

800” 

601 

800 

600 

8oo 



600 

* 

O 

o 

CO 

599 

800 


600 

J. 

1000" 

600 

1000 

600 

1000 



600 

1000“ 

600 

1000 


8oo 

200" 

800 

200 

800 

200 



800 

200 ' 

800 

200 


O 

o 

CO 

JU 

400 

800 

400 

800 

401 



800 

400* 

800 

400 


18 

8oo 

600 

800 

600 

800 

601 



800 

600'' 

799 

601 


19 

800 

CO 

o 

O 

801 

800 

799 

800 



800 

800 ” 

799 

800 


20 

800 

1000" 

801 

1000 

800 

1001 



800 

looc" 

799 

1000 


Reference location for corresponding line and column 
centers which are tabulated to the right. 




5- Id. Tippecanoe County, Indiana 
LARS Riifi Number 72053603 
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LARS 

# 

Channel 
is . 

]-4 

5- 

8 

9-12 

Date 

Taken 

Data 

9/30/72 

10/19/72 

11/29/72 

Area 

# 


Line Column 
Center Center 

Line 

Center 

Column 

Center 

Line Column 
Center Center 


1 


200 20.0 " 

•_ 199 

201 

199 203 



200 

200 

200 202 

• •• 

9 


200 400“ 

199 

401 

200 402 




200 

400* 

200 401 


200 

600* 

200 

601 

200 

602 



200 

600" 

201 

600 


200 

800 “ . 

200 

801 

199 

799 



200 

800 “ 

201 

801 


400' 

20 0" 

399 

200 

398 

201 



400 

Jt* 

200 

399 

200 



mmm 

400 

401 

399 

401 



4oo 

400“ 

399 

400 


400 

600“ 

400 

.600 

399 

600 



. 400 

600 " 

400 

600 


400 

. * 

800 . 

400 

800 

401 

800 



400 

800 ~ 

401 

799 


9 

600 

. 200 

. 599 

200 

599 

200 



. 600 

200 

599 

200 


i n 

600 

400" 

599 

400 

599 

399 

1 u 



600 

O 

O 

599 

399 

- . . . 

1 1 

600 

600 

600 

. 600 

599 

599 

i i 



600 

600 

599 

599 




Table 5“ld, cont. 


LARS 

# 

Channel 
i s 

1-4 

5-8 

9-12 

Date 

Taken 

Data 

9/30/72 

10/19/72 

11/29/72 

Area 

§ 


Line Column 
Center Center 

Line Column 
Center Center 

Line Column \ 
Center Center 


12 


600 800 “ 

600 800 

600 799 



600 800“ 

600 799 


Reference location for corresponding line and column 
centers which are tabulated to the right. 




Pr [Difference image -has value x] 


No. of points in the difference 

image having value x 

Total no. of points in the 
difference image 

These probab? 1 i ty . densj ty, funct ions were then plotted for a visual com- 
parison. For- the initial phase of this examination a sample size of 111 
lines by 1 1 1 columns- was chosen for each test site. 

Before exam! ning" the resulting probability density functions, first 
consider some of the types of density functions that one mights expect. 
Referring to Chapter- 2'where an expression for the variance of the regis- 
tration error is derived, the first method of approaching the problem, 
that, is, via a maximum- a posterior-i estimate of the translation parameters, 
requires- the assumption that the noise be normal ly- d i stributed. In the 
light of thris method, of analysis, a Gaussian distribution would be highly 
desirable. 

In previous analyses of mul t ispectral imagery,. -[36,37] an- image is 
modeled as being comprised of ’ di fferent- Homogeneous classes, each of 
which is- normally distributed.. Thus when considering two temporally 
differing images-,, the difference, image is composed of the change that has 
occurred for each of the different classes. Since each class, has a 
Gaussian distribution, the change for each-class is also normally dis- 
tributed-. This follows from the.' property, that the di fference of. two 
normally d istributed: data sets aiso has a Gaussian distribution'. 

One may formal i-ze. this line.-.of reasoning in the fol 1 owi ng, manner. 

Let D be the entire difference image, and 1 D. be the additive noise for 
the ith class. Since the noise for’ a particular class is Gauss-ian, the 
probability density function of D-. i.s, 
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D. 


(x) = 


v^tT 


exp {“ 


a. 


(x - y .)' 

~“2 

2a. 

i 


(5-2) 


where, 

p 0 M - P D ( x l D i) 
r 


(5-3) 


the probability density of the difference image given that one considers 
only the D.th class. The probability density of the entire difference 
image is then. 


p (x) = I p (x) Pr[D ] (5-4) 

u i j ' 

which is a weighted sum of Gaussian density functions, where Pr[D.] is the 
probability of occurrence of the D.th class and, 

2 PrtD.] » 1 (5-5) 

i 

Given this formulation, what are reasonable forms for p Q (x)? First 
consider the case where all the D. are identically distributed. Letting, 

a 2 = Var[D. ] 

V = E[D. ] 

then. 


R„(x) -~^r- exp {- 


Jn~ „ 2 (5-6) 

v2tt a m 2 a 

so that D has a Gaussian distribution. This simplistic assumption yields 
a straightforward and compact expression for the difference image density 
function, however, it is not a reasonable assumption in many instances, 
in these cases one must retain the weighted expression for p D (x). For 
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example, one class may decrease in' reflectivity over a period of time 
while a 'second -class may increase in’ i tS' reflectance. Underlying reasons 
for this will’ depend upon the type of imagery that. is being considered. 
With agricultural data, the- classes being different crop types, one type 
of crop may reach maturity, in a given time period while a second cover 
type has not changed ii.n appea ranee .- Another example would be- that of 
d ifferent- Harvest ing- t-ibies for >d ? fferent crops. Or if one i s= exami ning 
a scene containing a' -body of water, the changes over the water may be 
independent of those over the surrounding land, so that the two classes, 
water" and' land, wi‘1-1 not necessarily have the same amount of change over 
the same pe-riod of time. 

This' latter' formulation for the probability density function more 
closely coincides- wi-th- expert men ta'l observation, where' the density function 
is modeled' as a weighted sum of Gaussian density functions. This is borne 
out by examination' of the probability- density function estimate's- of the 
generated difference images. In conjunction with the examination of the 
histogram plots, also consider examples of the corresponding difference 
images. By observation of these images one- can obtain- a better feeling 
for the resulting probability density' function estimates. The particular 
examples given rep'rese’rit- a cross section of those density functions en- 
countered'. Figures $-T through' 5'- 5 -contain examples of the difference 
images- and their corresponding probability density function estimates. 

Note that they are categorized according to the observed density function 
estimates-. 

Figures 5“1 and 5-2 contain examples of difference images' whose 
probability density functions have a single mode. Referring back to the 
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a. Difference Image 


b. Probability density function 
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Figure 5“1. Difference image with single mode 
probability density function. 
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Figure 5“2. Difference image with single mode 
probability density function. 
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Figure 5~3. Difference image with dual mode 
probability density function. 
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Figure 5“ 4. Difference image with nondistinct 
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Figure 5 _ 5. Difference image with multimodal 
probability density function. 


equation form for the density function, this indicates that the mean of 
the noise for each of the classes is the same, i.e M 


Observe that while the field structure of the scene is visible, there is 
little differentiability between the fields in terms of the gray level 
representation of each. It is this non-uniqueness of the gray level in- 
tervals for each class that yields the single mode probability density 


Figure 5”3 illustrates the situation where the density function is 
dual modal. This type of density function is indicated in the difference 
image by the predominance of essentially two gray levels. Also note that 
the difference image contains the field structure of the scene, which 
supports the hypothesis that the temporal change is somewhat class de- 
pendent. 

Figures 5-k and 5-5 contain other examples of the types of density 
functions encountered. A multimodal example is presented in Figure 5 -i », 
and Figure 5“5 illustrates a case in which the modes are not separated. 

Although each of these examples differs in the type of density 
function observed, they all have a common factor. The basic field struc- 
ture of each of the scenes is still intact in the difference images. The 
conclusion one may draw from this observation is that the temporal changes 
are dependent in part upon the different classes within the scene. 
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5«2.- Autocorrelation Function of the' Temporal Changes 
In the earlier analysis (Chapter 2) it was found that a suitable 
model of the autocorrelation function was required in the derivation of 
a registration' processor. Both approaches to the problem necessitated 
knowledge of this nature. The probabilistic approach based upon the 
premise of normally distributed noise inherently requires a model of the 
autocorrelation’ function simply by the functional form of the probability 
density function. 

T ^ 72 ^172 ex P I !iE> <5-8! 

n_'=- noise; assumed zero mean here 
A =■ autocorrelation function (matrix) of the noise 
Note that the. autocorrelation matrix and autocovariance matrix are the 
same for zero. mean noise. 

One also comes across the need for an autocorrelation function model 
while approaching' the registration problem via the second method. The 
basic design criterion utilized in this approach is that the processor be 
a linear filter which yields a maximum output at the correct registration 
position in the absence of noise. In order to obtain the compact expres- 
sion for the .variance of the registration error, i .e.-, as the reciprocal 
of the output signal-to-noise ratio times the square of the effective 
bandwidth (Equations 2-46 and 2-47), a particular form for the processing 
filter was chosen, the matched filter, which maximizes the output signal- 
to-noise ratio (Equation 2-36). When one examines the expression for 
this type of ‘f Miter in the frequency domain, one finds that it depends 
upon the reciprocal of the spectral density function of the noise which 
is determined uniquely by the autocorrelation function of the noise. 
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Therefore, one again finds ithat knowledge of the .autocorrelation function 
of the noise is essential. 

Determination. of an approximate functional form for the autocorrelation 
function was carried out by experimentally estimating the autocorrelation 
functions of the series of difference images which .were generated for 
the probability density ’function estimates (cf. Table 5"1 for the areas 
used). -Since -the noise is modeled as being additive, it is just the 
difference between two registered images. The following expression was 
used to estimate the autocorrelation functions. Note that if is an 
asymptotically unbiased estimator. 


] N-£ N-k _ - 

j f 1 (x ij~ x)(x m,j+k" x) 

^ ” Oj 1> f o o o y L 
k ” 0 f 1 J O- • O y K 


(5-9) 


Where, 


.R„ , = autocorrelation function estimate at shift (£.,k) 

a,k 

X. . ^ (i,j) th element of the difference image 

1 yj 

x = mean of the difference Image 

L,K = maximum shift along the lines and columns respectively 

2 

•N = number of data samples in -the difference image,. 

The results obtained are best illustrated by viewing several examples 
of t-he estimated autocorrelation surfaces. Figures 5~6a through 5-1 la 
contain several different surface .estimates. The sample image size chosen 
is 111 lines by 111 columns with a maximum shift of 16 lines or columns. 
The amplitude of the surface is represented on a normalized scale, where 
zero on the scale corresponds to a value of zero for the autocorrelation 
function estimate', and the scale increment is 1.00/maxjR . j. 

y K 
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Figure 5-10a. Difference image autocorrelation surface. 
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Figure 5-1 la. Difference image autocorrelation surface. 
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Figure 5“T1 &-• Difference image autocorrelation" surface contour 
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One may now observe the characteristics of the surfaces , First note 
that the surfaces are smooth and nonincreasing for shifts around the 
central peako A violation of this nonincreasing property occurs only 
for small values of R^ k , and may be attributed to the property that the 
displayed surfaces are only estimates of the actual autocorrelation 
functions. Furthermore, by close examination, the surfaces appear to be 
exponential in nature. This last observation is best . i 1 1 ustrated by 
plotting equiampl i tude contours for each of the. autocorrelation surface 
estimates. Figures 5~6b through 5-1 lb contain such equiampl i tude con- 
tours for the corresponding surfaces, where the contour levels have 

— 1 _ O 

been chosen at R , R e , and R e . 

o,o 7 o,o * 0,0 

This particular choice of contours levels in terms of an exponentially 
decreasing function was prompted by the initial observation that the 
surfaces seemed to be exponential in form. Confirmation of this expo- 
nential property is achieved if the radial increment of the contours for 
each surface is a constant. From examination of these surface contours, 
it is seen that the radial increments are indeed approximately constant 
for each surface, so that the noise autocorrelation function is exponen- 
tial in nature. 

Now that one may reasonably model the autocorrelation surface of the 
temporal changes as exponential in form, one can use this model for the 
design of a registration processor. An example of an optimum processor 
based on an exponential autocorrelation function for the noise was given 
in Chapter 3- The reason for inclusion of this particular example in 
the previous chapter is now clear. It was in anticipation of the experi- 
mental observations that the example was chosen. It was presented in 
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the context of . i Hus t rating the method by which one solves for the processor 
based upon the noise autocorrelation characteristics. However, the example 
is directly applicable to the practical situation where an autocorrelation 
functionof an exponential nature is actually observed. 

Before applying the technique of the example to the experimentally 
determined .autocorrelation surfaces, it is necessary to carry out an 
additional step 0 The example presumed the following functional form for 
the autocorrelation surface. 


2 " a l T x l " 6 IV 

R(x ,t ) = A e ^ 


(5-10) 


Note that this requires 'the major axes of the correlation surface to co- 
incide with the x and y. coordinate axes. Unfortunately, the experimentally 
observed surfaces do not comply with this assunpti’on. However, this dis- 
crepancy is remedied quite easily by providing a linear spatial trans- 
formation to the correlation surface to adjust the major axes so that they 
are aligned with the x and y coordinate axes, solving for the filter 
function as is done in the example, and then applying the inverse of the 
linear spatial transformation to return to the original coordinate system. 
Again referring to the example solution (Equations 3“ 3 1 to 3“33), one 
finds that the pr-ewhlteni ng operation becomes a derivative type operator. 
This res.ult suggests that the performance of the registration processor 
may be improved by first preprocessing the imagery via a derivative type 
operator followed by a crosscorrelation operation instead of just cross- 


correlating the imagery directly. The experimental study discussed in 
the next chapter supports this hypothesis, where it is found that pre- 
processing the imagery via a gradient type operator (which i:s sa derivative 
operation) does increase the reTiabiTity of the registration .processor. 
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CHAPTER 6 

EXPERIMENTAL INVESTIGATION OF SIMILARITY MEASURES AND 
PREPROCESSING METHODS USED FOR IMAGE REGISTRATION 

6. 1 . introduction 

The study described in this chapter is the experimental examination 
of several different processors designed to overlay digital imagery (a 
more detailed-discussion of this study is presented in [35]). The 
impetus for such an investigation was provided by the development of 
several different registration algorithms that had evolved and been 
tested independently of one another [1 , 2 , 3 , 8 , 9 , 11 , 30 ] , thus leaving 
the potential user at a loss to objectively compare the different methods. 
This study is designed to allay this problem of choice by an experimental 
comparison of the basic techniques used in each of these algorithms to 
spatially match two temporally differing images. The approach taken is 
to record the performance of each of the techniques over a series of 
selected test sites where mul ti temporal imagery was readily accessible. 

In this way a quantitative measure of the performance of the various 
algorithms on a comparative basis would-be made available. 

The images used in this investigation were taken by the LANDSAT I 
satellite mu 1 1 i spectral scanner which operates in four spectral bands: 

0. 5-0.6 urn, 0.6-0. 7 ym, 0. 7-0.8 ym, and 0. 8-1.1 ym. Orbiting at an 

altitute of about 600 miles, the recorded data samples have a resolution 

\ 

of approximately '50 meters along the scanner sweep by 80 meters along 
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the satel 1 i te's path, so thaj: a full frame -consisting of about 2340 lines 

by 3240 columns covers an area^of about 100 by 100 miles. Mill ti temporal 

coverage of the same area i s- accompl ished by the orbital path of the 

f - 

satell ite .which’ cyclically repeats itself about every 18 days. 

Figures 6-V thro.ugh 6-4 contain several examples of mul ti temporal 
images over the same .area. The images have been -chosen from 'the general 
test site areas- used in this investigation and are' typical of .those 
utilized for the exper imenta-1 analysis. All of these- pi ctures were 
taken by the LANDSAT I mul ti spectral scanner and are in the 0..8-1.1 pm 
spectral band. Figure 1 6-1 shows: two images .over Tippecanoe County, 

Indiana, which were taken in September and November of 1972. .A scene 

\ 

from Hill County, Montana over .two times during the spring and summer 
seasons is pictured in Figure 6-2. An example of an area over a year's 
span is illustrated in Figure 6-3 wher.e data over western Kansas is 
shown’ for July, of 1-9,73 and 19.74. And two temporal differing .data sets 
over Missouri' are shown- i.n Fjgure 6-4. Notice that in all of the 
examples the areas for eaph time pair are recognizable as the* same, 
however, changes that have occurred- are evident., Also observe that the 
spatial scale of both images in each time pair appears to be the same, 
with little relative distortion. This property of the images is derived 
from the stability of the satellite viewing* platform which incurs minimal 
perturbations in its orbit. Such small fluctuations -in scanner- posi t ion 
over an area from one time to the next provides the approximate- spatial 
congruence between the temporally differing images. • 

This investigative comparison experimentally explores the basic . 
concepts which underlie these algorithms to provide an objective way of 
judging the performance of the different registration processors.. All 
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Figure 6-3. LANDSAT I imagery over Kansas. 
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Figure 6-*. LANDSAT I imagery over Missouri. 
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of the algorithms operate in the same fundamental manner. With the 
minimal relative spatial distortions between temporally differing LANDSAT 
I images, the first assumption made is that no relative, spatial distor- 
tions exist for small images. Therefore, registration of these small 
images requires only an estimation of the relative translation between 
the images. 'Gj-ven the two images to be overlayed, a search procedure is 
performed to flqd this relative translation. One image is shifted about 
over a larger temporally differing image and a measure of the similarity 
is computed at each shift position.' The translation at which this 
measure indicates the most similarity is designated as the registration 
position. This" is the fundamental procedure utilized in each of the- 
reg i strat ion ",al gori thms, however, the different methods depart from one 
another in the similarity measure employed and the type of images used. 

The first" part of the study examines the criteria used to measure 
the similarity between two images. This is an important part of the 
registration processor since the spatial matching of the imagery requires 
a quantitative measure of their similarity. Three different similarity 
measures, which are representative of those used in the algorithms of 
interest, are evaluated in this investigation. The first is the correla- 
tion coefficient which is the measure presently being used in the LARS 
registration system [1,5]. Second is the sum of the absolute values of 
differences, the measure utilized in an algorithm which comes under the 
heading of sequential similarity detection algorithms (SSDA's) [8,9]. 
Finally the correlation function, an unnormalized version of the corre- 
lation coefficient is compared. The expressions for each of these 
measures are contained in Table 6-1. Note the varying complexity of the 
computational requirements of each. The correlation coefficient requires 
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the operations multiplication, division, subtraction, and’ addition, while 
the correlation function uses multiplication and addition only. And the 
sum of the absolute? differences requires only subtraction and addition. 
These computational-requirements are reflected in the amount of opera- 
tional time neede'd’t’o evaluate each of the measures. 

Secondly, preprocessing of the imagery prior to -the actual registra- 
tion and - its effe'ct on the overlay results is examined. Several incentives 
prompted’ this- area of' investigation. The first is that of improving the 
performance of the registration processor,, and the second of reducing the 
operational time and storage allocation needed to implement the overlay 
algorithms. 

Two' approaches leading, to the same type of preprocess i ng; ope rat ion 
have- been suggested for improving' the performance of the registration 
processor over that when the original imagery 5s used,’. The, f i;rst line of 
reasoning; concern's the enhancement of the boundaries of an image. In 
many - types’ of scene's the basic- geometrical- structure of the scene is 
contained in the boundaries (e.g.,- agricultural scenes or imag.es contain- 
ing roads). Since registration Is a spatial matching of the images, it 
inherently uses- the geometric structure of the scene. Therefore, 
processing’ the images vis an algorithm which accentuates this- geometrical 
structure' prior to overlaying the images intuitively' suggests that an 
improvement is possible. One such method of performing this boundary 
accentuation is by a gradient type operator. This was the method proposed 
in several registration algorithms implemented by previous investigators 

£11*30], • - 

'Another approach to the use- of preprocessing for performance improve- 
ment is presented in 'Chapters 2, 3, 5 and Appendix A of this ’Investigation. 
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In Chapter 2 and Appendix A an optimum registration processor is designed 
utilizing' the statistical properties of the temporal changes, which are 
defined as additive noise in the context of parameter estimation theory. 

It is shown that use of a matched filter processor both maximizes the 
output signal-to-noise ratio and minimizes the variance of the registra- 
tion error. A me.thod for implementing this type of processor is given in 
Chapter 3 whereby a preprocessing operation (the prewhitening filter) is 
used. Therefore, this suggests using a preprocessing operation conforming 
to that which ‘is part of the matched filter processor. From the temporal 
image statistical properties observed in Chapter 5 and the example given 
in Chapter 3>Jt- is shown that this preprocessing operation utilizes a 
derivative type operator which may be approximated by the gradient 
operator suggested above. 

The second . type of preprocessing concerns reduction of operational 
time and storage' al location needed to .register two sets of images. This 
may be achieved' by converting the images to a' binary format (having 
intensity level values of only zero or one). In this way a storage 
savings is realized since each data sample has been converted to one bit 
of information. Secondly, operational time may be reduced by using 
logical operat ions as opposed to arithmetical operations in the computer. 

Three types of preprocessing were selected. The first is computing 
the magnitude :of the gradient of the images (equation 6-1). From a 
visual standpoint this accentuates the boundaries within the images. 

Plus, it Is a derivative type operator which is the optimum preprocessing 
operation derived in the example in Chapter 3 for temporal changes with 
an exponential. autocorrelation function, the model observed for the 
temporal changes in Chapter 5. The second preprocessing operation is 
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thresholding the images at their medians (all values greater than or equal 
to the median are set equal toaone, and all else set equal to zero). 
Finally, the-magni tude of the gradient of the images- is computed and then 
thresholded at an arbitrary level to be determined experimentally. Again 
this particular- choice was made to approximate the preprocessing methods 
that had- been proposed and implemented by other investigators. 

6.2. $imi larity Measures ’ 

An important decision that must be made in carrying out image regis- 
tration is what criterion should be used to evaluate the similarity 
between two images. That is, what similarity measure should be selected. 
The similarity measures, bei ng considered can be divided into two general 
classes. The first class provides a measure on an absolute scale. An 
example of this is the correlation coefficient which is the similarity 
measure presently being used In the LARS registration system [1,5]. The 
values of the correlation coefficient range between plus and minus one. 

A value of one indicates that the two images are identical or. differ by 
a positive constant factor about their means. A value of minus one 
indicates that the two images differ by a negative constant factor about 
their means. When using the correlation coefficient, the registration 
position is indicated by the maximum of its absolute value which is 
computed for each of the possible registration locations. It i.s necessary 
to consider the absolute value since it is possible that the temporal 
changes may cause a shift about the mean of the images which would result 
in a negative value for the correlation coefficient. Another feature of 
the correlation coefficient is that not only is its scale limited,. bu.t 
its value on that scale gives an indication of how good the images are 
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Table 6-1 V Equations for the correlation coefficient, correlation 
function, and sum of absolute values of differences 
- similarity measures. 


A. Correlation Coefficient, P^' 
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linearly related. The expression for the correlation coefficient is 
given in Table 6-1 . 

The second-class- indicates the registration position by a maximum 
or minimum value at the registration location. Two examples of this are 
the correlation function, which is an unnormalized version of the correla- 
tio'n coefficient, and the sum of the absol ute' val ues of the differences 
between the two’ images, the similarity measure used in a registration 
algorithm which comes under'the heading of sequential similarity detection 
algorithms (SSDA's) [ 8 , 9 ], The expressions for these similarity measures 
are- listed in Table 6 -i . For the correlation function, the registration 
position is indicated by a maximum or minimum value which is computed 
for each' of the possible overlay locations. For the sum of the absolute 
differences measure the registration position is indicated by a minimum 
value. In these examples there is no absolute scale, so that the value 
of this- maximum or minimum by itself will not give a good indication of 
how closely the two" images match". The exception to this occurs in the 
absolute value of the difference's case when the two images match 
perfectly-. However, if the difference between the two images is modeled 
as additive noise, a confidence interval can - be established in the 
absolute Value of the difference case b.y using the resulting minimum 
value in conjunction with the probabi 1 ‘ity distribution of the noise [9 ]. 

The choice that must be made w? th reg,ard to the s imi 1 ari.ty- measures 
is influenced By considerations- such as the following. (l) Hbw/well do 
the different methods perform? Is there a. way to theoretically predict 
this perfo'rmance,- and if so, wha-t are the- results? Also included in 
this question is whether there exists some sort of confidence measure so 
that the results may be evaluted quantitatively. (2) What operations are 



involved for each of the methods, and, what are the comparative times 
needed? (3) If it has been determined that several methods of registra- 
tion yield reasonable results with respect to the ability to find the 
correct registration position, then what are the tradeoffs between the 
accuracy and the time and number of operations involved? For example, 
if one method yields the. correct registration position in 95% of the 
attempts but requires twice the operational time as a method which is 
able to find the correct location 75% of the time, which method should be 
used? One criterion that is essential for this decision is whether the 
occurrence of a false indicated registration position is known to be 
false when it appears. 

For the experimental analysis, test sites were chosen from LANDSAT I 
imagery over Missouri and Kansas. Tables 6-2 and 6-3 contain listings of 
the dates the data were taken and the approximate location of the LANDSAT 
i frame centers for the data. A complete tabulation of these sites ?s 
given in [35]. The spectral bands chosen for this analysis were 0. 8-1.1 
ym for the Missouri data and 0.6-0. 7 ym for the Kansas images. The sub- 
images used to evaluate the registration algorithms were 51 lines by 51 
columns in size. Typical pictures of these general areas are shown in 
Figures 6-3 and 6-4'. 

Evaluation of the results is in terms of the percentage of acceptable 
registrations out of a given number of attempts. The nonacceptable 
attempts are those where the indicated registration location was known to 
be false. Such a criterion clearly requires some a priori knowledge of 
the relative translation between the images in question. For the Missouri 
imagery three temporally differing sets of data had been previously regis- 
tered to within a few pixels via the LARS registration system [35]. 
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Tpble'6-2„ Test, site area in Missouri. 

Approximate location of frame ^center: .Latitude: 37°24'N 


Longitude: 88°45'W 


*LARS ^Run Number: 72033804 

■Date 

Data Taken 

Corresponding Channels 
- in Run 72033804 

9/:l;3/72 . 

1-4 

;8/<26/72 

5-8 

10/1 /72 

9-1:2 


Table 6-3.. Testssite area in Kans-as. 

Approximate location -of frame center: -Lafi'.tude: 37°28'N 


.Longitude: 100°31 'W 

LARS Run 4 

Date 

Data Taken 

73046000 

.7/6/73 

73Q64000 

8/29/73 

74024100 

- 5/26/74 

74024200 

7/1/74 


ISSKoS: 



Therefore, any substantia] deviation from this was taken as an unaccept- 
able attempt. For the Kansas data this a priori information was supplied 
by careful visual checking of the imagery. 

The overall acceptability comparisons are listed in Table 6-4. The 
results are tabulated for both the original and preprocessed imagery so 
that a parti cuTar similarity measure may be cross referenced among the 
different types. of images registered. For example, with the correlation 
coefficient there is a 90% acceptability using the original images, 100% 
for the magnitude of the gradient of the images, 65% for the images 
thresholded at their median, and 90% with the magnitude of the gradient 
of the images thresholded at an appropriate level. 

Between -the three similarity measures examined, the correlation 
coefficient consistently yielded the highest percentage' of acceptable 
registrations. This is evidenced by the range of percent acceptabilities 
within each column-. For example, when the magnitude of the gradient of 
the images were registered, there was a 100% acceptability for the corre- 
lation coefficient measure, Jk% with the correlation function, and 92% 
acceptability with the sum of the absolute values of the differences 
measure. Therefore, on a performance-wise basis, these results indicate 
that the correlation coefficient should be chosen as the similarity 
measure. 

However, the question of the tradeoff between operational time re- 
quired and performance must still be examined. Is there a measure which 
reduced the reliability only slightly while accompanied by a large time 
savings? Refer -to the percentage acceptable registrations in Table 6-4 
for the magnitude of the gradient of the imagery. Note that while there 
was 100% acceptability using the correlation coefficient, there also was 



. fable 6-4. Percent (Number) of Acceptable Registration Attempts 


Original 


Magni tude 
of the 
Gradient 


Threshold'! ng 
at the 
Med i an 


Thresholding the 
Magnitude, of the 
Grad lent 


Similarity 

Measure 


- 



* -V. " 

Total 
Number of 
Attempts 

90 

66 

66 

30 

Correlation Coefficient 

90% (81) 

100% (66) 

653 (43) 

90% (27) 

Correlation Function 

38% (34) 


55 % (36) 

87% (26) 

Sum of Absolute Values . 
of Differences 

' 69 % (62). 

92% (61). 


87% (26) • 
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a 92 % performance -with the sum of the absolute difference measure. This 
result in conjunction with the reduction in the number of computations 
and thus, time savings achieved, by using this latter measure . (Table 6-1), 
indicates that in a time-performance evaluation, it might be more advan- 
tageous to use the sum of the absolute difference measure as opposed to 
the correlation coefficient. 

Overall, the best performance was achieved by the correlation co- 
efficient using the magnitude of the gradient of the imagery. Therefore, 
if percent acceptability is of prime importance, this preliminary com- 
parison indicates that preprocessing the imagery via a gradient type 
processor enhances the ability to find an acceptable registration position. 
The next section concerning the effects of preprocessing prior to regis- 
tration pursues this observation in more depth. 

6.3. Preprocessing Methods 

In the search for an optimum processor for image registration it 
has been proposed that preprocessing of the data prior to the actual over- 
laying procedure may be a step towards the solution of this problem. There 
are several underlying reasons for this suggestion. First, preprocessing 
may yield a greater reliability of the system's registration performance. 
This is supported by the analyses in Chapters 2, 3> 5 , and Appendix A. 

In Chapter 2 and Appendix A it is shown that the optimum processor 
utilizes the statistical properties of the temporal changes, in particular, 
the optimum processor is a matched filter which requires knowledge of the 
spectral density function or autocorrelation function of the temporal 
changes. Chapter 3 presents a method of implementing the optimum pro- 
cessor using a preprocessing operation which is analogous to the 




experimental investigation 'In this section- Therefore, once a model of 
the autocorrelation function irff .the temporal changes Is det-e'rmined, the 
preprocessing operation, cor responding to that for the optimum, processor 
may be found. In the .experimental investigation discussed in Chapter 5, 
the results indicated that an exponentially decaying' autocorrel at ion 
function is a reasonable mode.l for the autocorrelation function of the 
temporal changes.. When this model is used it is found in the-:example of 
Chapter 3 that a derivative type preprocessing operation will yield the 
optimum processor. Thus, implementation of .a preprocessing operation of 
this form should improve 'the registration performance. 

Secondly,, the time and operations -requ'i red may be substantially re- 
duced. An example of this is conversion of the original image into a 
binary image .(data values of only 0 or 1) so that logical operations may 
be employed in the .computer instead of arithmetical operations. 

The study undertaken here is an experimental examination of several 
preprocessing techniques and their effects on image registration. Three 
basic methods were .chosen. The first method utilizes the magnitude of 
the gradient of the imagery given :by, • • 

, 7 1/2 

I Gradient of X. .[ = {(X. . . - X. . .) 4 + (X. • - X. . .) } 

> »J '+1>J t,J“l (6_l) 

where X. . is the image intensi t-y at coordinate (i,j). Since the gradient 

i ,J -.5 

operation j.s a -derivative type operation, this method of preprocessing 
conforms to the optimum approach derived using the observed autocorrelation 
function of the temporal changes of Chapter 5 in the example of Chapter 3. 
Therefore, based on this analysis, use of the gradient preprocessing opera- 
tion should improve the registration processor performance. 
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The second method consists of thresholding the imagery at its median 
(all values greater than or equal to the median are set equal to one, and 
all else set equal to zero). And the third method computes the magnitude 
of the gradient of the imagery and then thresholds i t at an appropriate 
level . 

Typical images resulting from carrying out these preprocessing 
operations are shown in Figure 6-5. Figure 6-5a is the original image 
taken by the LANDSAT mul ti spectral scanner over Hill County, Montana. 
Thresholding the original image at its median results in Figure 6~5b. 

Note that although the thresholded image contains only two levels (0 and 
1), it represents the field structure of the scene quite well. 

The magnitude of the gradient of the image is illustrated in Figure 
6-5c. Note that boundaries between the fields have been accentuated by 
the gradient operation. This is the expected result. The gradient is a 
derivative type operator, so its magnitude at a point increases with the 
slope at that point. Since the boundaries of the scene indicate an 
increase in slope, the magnitude of the gradient at the boundaries is 
large. 

Figure 6~5d shows the resultant image after the magnitude of the 
gradient has been computed for the original image and then thresholded 
at an appropriate level. This is a binary image containing value of 
only zero and one. Again the basic field structure is represented quite 
well. 

LANDSAT imagery over Hill County, Montana, Tippecanoe County, 
Indiana, and Kansas were used for the analysis. The ready availability 
of mul ti temporal data prompted these particular choices. A listing of 
the dates the data were taken and the approximate location of the LANDSAT 
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a. Original LANDSAT I image over 
Hill County, Montana. 


b. Original image thresholded 
at its median. 



c. Magnitude of the gradient 
of the original image. 


d. Magnitude of the gradient that 
has been thresholded at an 
appropriate level. 


Original Image 

LARS Run # Date Data Taken Spectral Band Lines Columns 
73124700 5/5/73 0.8 - 1.1 ym (329,451) (80,206) 

Figure 6-5. Examples of images resulting from different 
preprocessing techniques. 


ilEPR 

oiug: 


UCIBILITY OF THE 

.L PAGE IS POOR 



149 


i frame centers are shown In Tables 6-2, 6~3» 6-5 and 6-6. - For a com- 
plete tabulation of the test sites refer to [35]. The actual subimage 
sizes that were to be registered for these comparisons were 51 lines by 
51 columns. 

Again, evaluation of the performance is in terms of the percent of 
acceptable registration attempts. Like the similarity measure comparisons, 
visual examination or previous registration to within a few pixels pro- 
vided the a priori information for determining the acceptability of an 
indicated registration position. Also, in order to provide a common 
basis for comparison, the correlation coefficient was chosen as the 
similarity measure for all of the attempted registrations. 

The acceptable-unacceptable attempts are tabulated in Table 6 - 7 . 

Note that the results have been divided into three sections: (l) the 
cases where the magnitude of the correlation coefficient (|p|) for the 
original imagery is greater than or equal to 0.5, (2) the ]p | for the 
original imagery -is less than 0.5, and ( 3 ) the overall results. The 
underlying reason' for this-parti tion is to examine the relative per- 
formance for the high correlation cases (jp| > 0.5) and the low correla- 
tion cases ( |p | < 0.5) separately, as well as for the overall results. 

First consider the overall results. Preprocessing the imagery via 
the magnitude of the gradient yielded the highest percent acceptability 
with 100%. Also, thresholding the magnitude of the gradient performed 
very well (97%). The important point to note, aside from the best per- 
formance, is that on an overall basis preprocessing of the imagery with 
a gradient type transformation boosted the performance over that utiliz- 
ing the original imagery. This result supports the analysis of Chapters 
2, 3, 5, and Appendix A where the optimum processor in the presence of 
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7/16/73 

8/3/73 


-1 7-20 
9- 12 
5 - 8 . 



Table 6-7. Percent (Number) of Acceptable Registration Attempts 


Original Ima 


Gradient 


the Median 



fp 1 > 0.5 
for 

Original Imagery 


|p | < 0.5 
for 

Original Imagery 


Overal 1 


100% (75) 75 


65% (37) 57 


100 % ( 75 ) 75 


100% (57) 57 


61* (35) 57 


( 112 ) 132 100 % ( 132 ) 132 


Thresholding the 
Magnitude of the 
Gradient 

Acceptable 

Total # 

Attempts 

Attempts 

100% (64) 

64 

89% (25) 

28 

37 % (89) 

92 
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exponentially autocorrelatgd temporal • changes .utilizes a derivative type 
operator In the pgeprocess mgs stage. .The analysis is also corroborated 
for the. . other. -simpar ? ty measures In Table 6-4. Comparison of the per- 
centage, of acceptable registrations for the gradient type preprocessors 
shows a - substantial . improvement in performance over both the original 
imagery,.and the images , thresholded at their median for each idf the 
simijari ty -measures. This Indicates that choice of a preprocessing 
operation .conforming -.to that derived via the matched filter (.Chapter 3) 
may ? ndeed -.provide a more,-rel iable registration processor. 

•Several questions may be asked about this observation. Is there 
any trend^to this increased reliability? Are there any image characteris- 
tics which seem to-.coiitri.bute to these observations? One answer to these 
questions is embodied . in .the partitioning of the overall results into 
the hi.gh-.and low correlation. .cases. 

Examination -.of 'the hi gh< correlation instances ( (p | > 0.5-'for the 
ori g inal piata) shows that dll of .the- preprocessing methods performed ex- 
ceedingly well wi th ;96%..acceptah|l i ty for thresholding the data. at Its 
median. and 100% for the rest. 'This indicates that when the original 
imagery is, highly correlated, any of the .preprocess ? ng methods .works 
equal ly wel 1 . In thi-s.case.no> advantage ‘is gained performance-wise by 
preprocess, ing the imagery pri.or to -registration. 

The -most striking -result came .with the low correlation cases (|p| < 
0.5 for the original data).. For ..these cases a marked advantage rover 
■using the original imagery .was obtained by preprocessing the data via a 
gradien.t type processpr. Use of .-the magnitude of the gradientcof the 
imagery provided a 100% acceptability compared with the 65 % performance 
for -the original data. Thresholding the magnitude of the gradient also 
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indicated a distinct increase in reliability. These results suggest that 
a substantial increase in the reliability of the registration processor 
may be achieved when the original imagery is not highly correlated by 
preprocessing the imagery prior to registration via an operation con- 
forming with the preprocessing operation derived from the matched filter 
configuration of the registration processor (Chapter 3 )* 

Earlier, ?t was mentioned that a priori information was used to 
determine the acceptability of indicated registration positions. For 
imagery that had not been previously registered this took the form of 
visual examination for an individual test site. Such a procedure is 
quite time consuming and does not lend itself readily to an automatic 
mode of operation. However, while attempting the registrations at the 
selected test sites it was found that relative spatial information could 
be used for several test sites located in the same general area, or the 
same test site for over several different times. For example, if several 
different test sites indicated the same relative translation for regis- 
tration, while the registration position of another test site within the 
same general area indicated a substantially different translation, then 
this latter registration attempt would be unacceptable. Similar reasoning 
follows for several time pair registration attempts for a single test 
site. 

Another observation which may be made directly from Table 6-2 also 
suggests a. way- by which a partial acceptability decision might be made 
automatically. This approach is in terms of an absolute scale confidence 
measure. Since the value of the correlation coefficient (p) indicates 
the linearity of the relationship between two images, possibly a range 
of values for p exists which could be used to determine acceptability. 
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This is suggested iin the first line of Table 6-7 where the results when 
the magnitude of' the correlation coefficient is greater than or equal to 
0.5 for the original imagery are listed. For the original data there is 
a 100% acceptability for this range- of p . This suggests that the value 
of the correlation coefficient may be used to help in the determination 
of the acceptability of an- indicated registration position. 

6.4. Performance of an 1 Operati onal Algorithm Which: - 
Utilizes Gradient-Type Preprocessing 

The -observation made in sections 6-2 and 6-3 suggested that an 
improvement" in the? performance of the registration processor could be 
realized by fi rst - preprocess ing the images via a.grad'ient type- processor 
and'-then reg is ter ing these, gradient images. It was also- founds that use 
of the correlation' coefficient ass. the. similarity measure yielded the 
highest percentage acceptability of the three- meas tires compared. inde- 
pendent of this experimental study, but at' approximately the same time, 
an algorithm designed toregister LANDSAT 1' images was developed at 
Computer' Sciences Corporation [30]^ which utilizes both a gradient type 
preprocessing of the images and’ a; s‘imi 1 ari ty measure closely approximat- 
ing the- correlation coefficient.- The availability of this algorithm 
made it possible to- experimental ly observe the extension of the= results 
obtained - in the similarity measure- and. preprocessing comparisons- to an 
algorithm designed for "operational image registration. 

The fundamental operation-of” this algorithm is the same asr. that for 
the other registration" processors compared in this chapter. The images 
are assumed to be- spatial ly congruent thereby reducing the registration 
operation to that of finding the relative' translation between the images. 
The translation is found by shifting one image (the. overlay image) over 
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a larger temporally differing second image (the reference image) comput- 
ing a value of the similarity measure at each shift position. The shift 
position at which the similarity measure indicates the best spatial 
match is taken as the registration position. In this case the images 
used for registration are the original images from two spectral bands 
that have been .passed through a gradient type processor, thresholded at 
an appropriate level, and then combined to form a composite image. The 
similarity measure used is an approximation to the correlation coefficient 
designed specifically to operate with binary images. 

The gradient type preprocessing operation is expressed as follows. 


X. . 


i+U 


- I. 


i-l,j 


i,j+l 



+ ' 1 i + 1 ,j-l “ 'i-lj+r + ‘‘i+lj+l " 1 i-1 , j-1 


( 6 - 2 ) 


where. 


I. . = original image intensity value at coordinate (?,j) 

1 > J 

X. . = intensity value of image sample at position (i,j) after 
1 '•* preprocessing 

After an image has been passed through this gradient type operation it 
is thresholded at the level which is exceeded by only fifteen percent of 
the data values. In this way the original image is converted to a 
binary image having values of only zero and one, with a prescribed per- 
centage of points having the value one. 

These gradient and threshold operations are applied to two spectral 
bands of each image set. The two binary images from each spectral band 
then are combined via a logical 'or' operation to produce a single image 
to be used for the registration, the resulting image containing between 
15% and 30% values of one. In this fashion it is possible to use 
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information from more than one spectral band simultaneously for the 
overlay processor.. 

The- stmi 1 ari t-y measure used to determine the relative translation 

between these preprocessed images is defined as the ratio of number of 

coincident points between the overlay and reference image having the 

value of' 1 divided~by the total number of points having value 1 in that 

portion of' the - reference image- being tested as the registration location. 

N' M- 

,=1 ;!] Y i+&, j+k 

P « lk ' N ' 

* 9 K y Y 

1 i+&, j+k 

Where, 

P M = value of the s imi la'r-i ty measure at shift (£,k) 

X. . = value (either 0 or T) of the overlay image at coordinate 
l,J (i,d) 

Y' . . , = va-lue (eithe’r 0 or l) of the reference image at coordinate 
1 K ( i-H, j+k) 

N = number of lines and' columns in" the overlay image 
Once the preprocessing has : been completed, two methods of 1 operation 
may be employed. The first approach is to' compute -ful ly the- value of 
the similarity measure at all of the shift positions. The second method 
involves- partial computation of p ; the vaTue of p., i s fully computed 
only if its estimated magni tude* exceeds a certain level. For a: com- 
plete- discussion of this latter approach refer to Nack [30] where the 
algorithm is discussed. Use of this latter method finds its advantage 
in terms of the time savings achieved by estimating p^ rather than com- 
puting it fully at all shift positions. 

This investigation entailed implementation of this algorithm over 
the same test sites- used for the s-imM ari ty measure-'and preprocessing 

REPRODUCIBILITY OF THE 
ORIGINAL PAGE IS POOR 
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analysis. Since the primary objective of this section is to relate the 
overall performance of this registration processor to the results obtain- 
ed in the previous two sections, the performance of the processor when 
p^ is fully calculated at all shift positions is discussed here. For a 
discussion of the results when the estimation procedure was also utilized 
refer to [35 ] . 

For a meaningful comparison the size of the test sites and the 
acceptability-unacceptability criteria remained the same. The test 
sites chosen covered all of those used for the similarity measure and 
preprocessing method comparisons. These were from Kansas, Missouri, 
Indiana, and Montana. The general areas are listed in Tables 6-2, 6-3, 
6 - 5 , and 6-6, while a complete tabulation of all of the test sites is 
given in [35].-. 

Since a single method of registering the images was tested, the 
performance resul.ts may be summed up in terms of the percent acceptable 
registrations out of the total number attempted. The overall tabulation 
showed that 190 out of 192 registration attempts were successful, which 
is a 99% success rate. This result is in close agreement with the 
previous findings, of sections 6.2 and 6.3, where preprocessing via a 

1 

gradient operator followed by use of the correlation coefficient yielded 
the highest performance (Table 6-4). This high performance rate also 
corroborates the analysis presented in Chapters 2, 3, 5 and Appendix A, 
where it is shown that preprocessing via a derivative type operator 
yields an optimum registration processor. 
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APPENDIX A 

PROOF THAT THE MATCHED FILTER MINIMIZES 
THE REGISTRATION ERROR VARIANCE 


In section 2.3 an expression for the variance' of the registration 
error is derived. The basic design criterion for this method of approach 
is that the second image (mis registered image plus noise) be passed 
through a filter whose output is a maximum at the correct registration 
position in the absence of noise. General relations for the variance 
are given by equations (2-27) and (2-28) where the variance may be 
evaluated by inserting a particular filter function. At that point the 
matched filter was used to evaluate these expressions, which leads to 
compact formulas for the variance of the registration error along the x 
and y coordinate axes. In this appendix it is shown that not only does 
the matched filter provide the maximum output signal-to-noi se ratio at 
the correct registration location and compact expressions for the variance, 
but it is the optimum filter in the sense that it minimizes the variance 
of the error. . , 

To begin the proof one starts with the general expressions for the 
variance of the registration error along the coordinate axes, equations 
( 2 - 27 ) and (2-28), which are repeated here. 




(A-l) 
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Note that these relations are greatly simplified when the term g equals 

xy 

zero. In this situation the variance expressions become. 


~ v 2 x 
(x— x) = “ 2 “ 


(A-3) 


_ n 

(y-v) = -f 


(A-4) 


«* 

it is convenient to. determine- the conditions under which the term g (x,y) 

xy 

does equai zero. Si nee-- g(x,,y) is modeled as a second order polynomial in 

x andt y about the true' registration location*, and is a maximum at this 

position', one may appl-y a linear spatia-T transformation to the (x,y) 

coordinate system so that in' the- new coordinate system, say (Xj ,y , ) , the 

term g (x,y) will equal zero. One may then solve' for the fiflter which 
x, y. 

minimizes, the registration* error variance in this new coordinate system. 
Once this filter is found,, the inverse linear spatial transformation may 
be applied to return to the orig.inal - coordinate- system. There-fare, no 

, , w *v 

generality' is lost by' assuming g^_y.(x,y) equals zero, so that ones; may begin 
with expressions (A~3 ; ) and (A-^) . 

The equations for the variance may be expressed in their equivalent 
integral form (equations' (2-29)', (-2—31) , (2r-32) , and (2-33)), 


o //// h(a,g)h(y,5)R (a-y,3-g)dad8dYdg 

(x-x ) 2 - 2 (A “ 5) 

[// h(a,S)' f (x-a,,y-g).dadg] 

XX 
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~ 2 //// h(o,p)h(y,g)R (a-y,3“ |)dadgdydg 

(y-y) ¥*—: j 

[// h(a,3) f (x-a ,y~3) dccdg] 


(A-6) 


where R(x,y) is th'e autocorrelation function of the input noise, f(x,y) 
is the known signal {first image), and subscripts again denote the partial 
derivatives with respect to the corresponding variables. 

Given these expressions for the variance, one would like to find the 


" A ^ 2 a 2 

filter h(x,y) which minimizes (x-x) and (y-y) . For this derivation the 
problem will be broken down into two parts, first the minimization of 
(x-x) , then of (y-y) <> 

To begin, first restate the problem in an equivalent form. Minimiza- 
tion of (x-x) may be stated in the following equivalent form. 

Minimize 


1 (h) “ //// h(a,g)h(y,C) R vv (a“T,6“5)dadgdYd5 (A-7) 

Subject to 

J (h) =' [// h(a,B) f (x-a,y-B)dctd6] 2 =’ (A-8) 

where is a constant and I (h) and J(h) are functionals of the filter 
h(x,y) . 


The method' of solution employed will follow that presented by 
Franks [18]. However, before solving the problem several notational con- 
ventions that are used must be defined. The first is that of an inner 
product and the second that of an operator. Given two functions, g(x,y) 
and h(x,y), the inner product, <g,h>, is defined by, 


<g,h>= // g(x,y) h(x,y) dxdy 

and an operator, A(x,y), on a function h(x,y) is defined as, 


(A— 9) 


Ah-// A(x-a,y-3)h(a,3)dadR 


(A- 10) 
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Note that A h is a functiorvyof the variables x-and y. 

With these integral representations, 1(h) and J(h) may be expressed 
in terms of operations in Hilbert space as 

i(h) = <R h, h> (A- VI) 

“ ' AX 

J(h) = <h.,s> (A- 12) 

where, 

s = s (x,y) = f xx (x-x,y~y) (A- 13) 

Since .each of the functions, R (x,y), h(x,y), and f(x,y) ar=e- real, and 

XX 

1(h) and J(h) are quadratic functionals, it is shown in Franks [18] that 

* 2 

the fi Iter which minimizes 1(h) subject to j(h) = is the solution to, 

VJ_ - X W = 2 (A-14) 

where A^ .is a- Lagrange multiplier and VJ_ and 7J .are the gradient vectors 
corresponding to 1(h) and J (h) . These gradient vectors may be found from 
evaluating the -di reeti.onal derivatives of 1(h) and J(h) which are defined 
as follows. 


1 im 

1 (h + eu) 

- r(h) 


D 1(h) - e -s-0 
u - 

e 

— = <21* U. 5, 

(A- 15) 

1 im 

J(h + £.u) 

- J(h) 


D J(h) = s+,0 

u - 

e 

— = <VJ,u> 

(A- 16) 


where, 

D1 (h) = directional -derivative of 1(h) with respect to u 
D^J(h) = directional derivative of J(h) with respect to u 
and u is an arbitrary function w,i-th the property that, 

<u,u>-= // -u ‘(x,y)dxdy » 1 (A- 17) 

Substitution o.f equations (A-11) end (A-12) into (A-15) and (A-16), and 
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using the inner product properties for real 

functions g(x,y), h(x,y) and 

A(x,y)., with A(x,y) an even function, 


<g,h> ~ <h,g> 

(A- 18) 

<A g,h> = <A h,g> 

(A- 19) 

yields, 

■ 

<VI , u> ~ <2 R h, u> 

— - -xx- - 

(A- 20} 

<VJ, u> = <2 <h,s>s, u> 

(A— 2 1 ) 

or equivalently. 


<U_, = < 2 5 XX !]> y) 

(A-22) 

<VJ, u > = ( 2K, s, u) 

(A-23) 


since <h,s > - Kj . 

From these expressions one obtains the gradient vectors, 


VI = 2 R h (A-24) 

— — xx— 

VJ = 2 K s (A- 25) 

Then from equation (A-1*0, one must solve, 

2 R h - L2K. s = 0 (A- 26 ) 

-xx- 11- 

Rewriting this in integral form, 

2 // R xx (x-cx,y-|3)h(a,|3)dadf3 - 2X ] Kj f xx (x-x,y-y) = 0 (A-27) 

so that, 

// R xx (x“a,y-0)h(a,g)dadg = A^ f xx (x-x,y-y) 

The solution to this equation is found by taking the Fourier transform of 


both sides. 
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//// R (x-a,y-g)'h(a,p)e J ' 2TT ^ UX+vy ^dcidgdxdy 

XX 

- A ] K 1 /jf xx (x-x,y-'y)e" J * 2 lT ^ UX+V X^dxdy (A- 28 ) 

Which becomes, 

ri*u 2 u 2 S m (u,v) H( u >v) - -4ir 2 u 2 XyK^F (u,v)e J 2lT ^ u+ Y v ) (A-29) 

. where, 

S m (u,v) = Fourier transform of R(x,y) 

H(u,v) = Fourier transform of h(x,y) 

F(u,v) = Fourier transform of f(x,y) 

Rearranging this expression in terms of the filter, H(u,v), one obtains, 


. , „ F"(u.v)a"j 2T(;u ^ v) 

H(u * v) " x i K i s T i i. V) 


(A-30) 


m 


which is -the definition of the matched filter multiplied by an arbitrary 
constant factor, Thus, the filter which minimizes the variance 

along the x-axis is the matched filter. 


Alternatively, minimization of (y-y) may be done in the same manner 
where the' problem is equivalently stated as., 

Minimize 


n y “ //// h(a,8)h'('Y»?)8yy(a"Y,g-g)dadgdYd5 


(A-31) 


Subject to 


0 r* O' 0 

Syy = !•// h ( a »6)f y y(x-a,y-g)dadg3 = K- 2 , a constant (A-32) 

Since the problem i'S now in terms analogous to the minimization; of 
(x-x) , one may follow the same steps which' result in the fol Towing 


solution for H(u,v). 
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, . .. F , '(„.v)e' J2,l(;:u+ > ;v) 

“• v! - ' A 2 K 2 


(A-33) 


where < s the corresponding Lagrange multiplier,, Therefore, the filter 

A ** 2 * 

which minimizes (y-y) is the matched filter. 

Since the constants and K 2 are arbitrary, one may choose Kj and 

^ ^ 2 ^ 2 

K 2 such that 1 = A 2 K 2‘’ ThUS and ^y-y) are minim ' 2ed 

simultaneously by using, 

' , FNu.v) e " J2l(;:,J+ ^ > 

' H(u >' ,) s - ('u7v) 

m 


(A-3^) 


which is the matched filter. Therefore, use of the matched filter not 
only maximizes the output signal-to-noise ratio at the correct registra- 
tion location and yields compact expressions for the variance, but it 
also minimizes the variance of the registration error. 



